THE LINEAGE PROCESS IN GALTON-WATSON TREES AND
GLOBALLY CENTERED DISCRETE SNAKES

By Jean-François Marckert

Université Bordeaux

We consider branching random walks built on Galton-Watson trees with offspring distribution having a bounded support, conditioned to have $n$ nodes, and their rescaled convergences to the Brownian snake. We exhibit a notion of "globally centered discrete snake" that extends the usual settings in which the displacements are supposed centered. We show that under some additional moment conditions, when $n$ goes to $+\infty$, "globally centered discrete snakes" converge to the Brownian snake. The proof relies on a precise study of the lineage of the nodes in a Galton-Watson tree conditioned by the size, and their links with a multinomial process [the lineage of a node $u$ is the vector indexed by $(k,j)$ giving the number of ancestors of $u$ having $k$ children and for which $u$ is a descendant of the $j$th one]. Some consequences concerning Galton-Watson trees conditioned by the size are also derived.

1. Introduction. 

1.1. A model of centered discrete snake. We begin with the formal description of the notion of trees and branching random walks.

Let $\mathcal{U} = \{\varnothing\} \cup \bigcup_{n\ge 1} (\mathbb{N}^*)^n$ be the set of finite words on the alphabet $\mathbb{N}^* = \{1, 2, \dots\}$. For $u = u_1 \dots u_n$ and $v = v_1 \dots v_m \in \mathcal{U}$, we let $uv = u_1 \dots u_n v_1 \dots v_m$ be the concatenation of the words $u$ and $v$ (by convention, $\varnothing u = u\varnothing = u$). Following Neveu [22], we call planar tree $T$ a subset of $\mathcal{U}$ containing the root $\varnothing$, and such that if $ui \in T$ for some $u \in \mathcal{U}$ and $i \in \mathbb{N}^*$, then $u \in T$ and for all $j \in [\![1,i]\!]$, $uj \in T$. The elements of a tree are called nodes or vertices. For $i \ne j$, the nodes $ui$ and $uj$ are called brothers and $u$ their father. We let $c_u(T) = \sup\{i : ui \in T\}$ be the number of children of $u$ [here $c_u(T)$ will always be finite]. A node without any child is called a leaf, and we denote by $\partial T$ the set of leaves of $T$. If $v \ne \varnothing$, we say that $uv$ is a descendant of $u$ and $u$ is an ancestor of $uv$. An edge is a pair $\{u,v\}$ where $u$ is the father of $v$. A path $[\![u,v]\!]$ between the nodes $u$ and $v$ in a tree $T$ is the (minimal) sequence of nodes $u := u_0, \dots, u_j := v$ such that, for any $i \in [\![0,j-1]\!]$, $\{u_i,u_{i+1}\}$ is an edge. Set also $]\!]u,v[\![ \,= [\![u,v]\!] \setminus \{u,v\}$ and similar notation for $[\![u,v[\![$ and for $]\!]u,v]\!]$. The distance $d_T$, or simply $d$, is the usual graph distance. The depth of $u$ is $|u| = d(\varnothing,u)$. The cardinality of $T$ is denoted by $|T|$, and we let $\mathcal{T}$ (resp. $\mathcal{T}_n$) be the set of planar trees (resp. with $n$ edges, i.e., $n + 1$ vertices).

A branching walk is a pair $(T,\ell)$ where $T$ is a tree, called the underlying tree, and $\ell$, the label function, is a map from $T$ taking its values in $\mathbb{R}$. In other words, it is a tree in which every vertex owns a real label. We let $\mathcal{B}$ be the set of branching walks, and $\mathcal{B}_n$ be the branching walks associated with trees from $\mathcal{T}_n$.

We now introduce some randomness and construct a probability distribution on $\mathcal{B}$ and on $\mathcal{B}_n$.

The set of underlying trees is endowed with the distribution of the family tree of a Galton-Watson (GW) process with offspring distribution $\mu = (\mu_k)_{k\ge 0}$ starting from one individual. In this model, all the nodes have a random number of children, according to the distribution $\mu$, independently from the other individuals. We denote by $T$ a random tree under this distribution (see, e.g., [1, 10] and most of the cited papers for more information on GW processes and trees).

The distribution of the labels is defined as follows. Consider $(\nu_k)_{k\in\{1,2,\dots\}}$ a family of distributions, where $\nu_k$ is a distribution on $\mathbb{R}^k$. The labels are defined conditionally on the underlying tree $T$: set $\ell(\varnothing) = 0$, and for any $u \in T \setminus \partial T$, consider

$$X_u := \bigl(\ell(u1) - \ell(u), \dots, \ell(uc_u(T)) - \ell(u)\bigr),$$

the evolution-vector of the labels between $u$ and its children. Conditionally on $T$, we assume that the r.v. $X_u$ are independent, and that $X_u$ has distribution $\nu_{c_u(T)}$. This determines a distribution on $\mathcal{B}$. For example, if $\nu_k$ is the uniform distribution on $\{-1,+1\}^k$ for any $k > 0$, then the r.v. $\ell(u1) - \ell(u), \dots, \ell(uc_u(T)) - \ell(u)$ are independent with common distribution $\frac12(\delta_{+1} + \delta_{-1})$ ($\delta_x$ stands for the Dirac mass at $x$). In the case where $\nu_k$ is the uniform distribution on $\{(1, \dots, k), (-1, \dots, -k)\}$, the r.v. $\ell(ui) - \ell(u)$ and $\ell(uj) - \ell(u)$ are not independent and do not have the same distribution.

Notice that a sequence of i.i.d. $\mu$-distributed random variables indexed by $\mathcal{U}$ allows one to build the Galton-Watson trees, and a sequence of random variables indexed by $\mathcal{U} \times \mathbb{N}$ allows one to build all the labels (by attaching to the elements of $\mathcal{U}$ a list of random variables with distribution $\nu_1, \nu_2, \dots$). We assume that we work on an underlying probability space $(\Omega, \mathcal{A}, \mathbb{P})$ on which are defined all the random variables and processes used in this paper.

Fig. 1. A tree on which is indicated the depth-first traversal, its height and contour processes.
We now define two sets of assumptions (H1) and (H2) that will be assumed to be satisfied in most of our results. (H1) is the following set of conditions:

$$(\mathrm{H}_1) := \Bigl(\mu \text{ is nondegenerate, critical and has a bounded support: } \mu_0 + \mu_1 \ne 1,\ \sum_{k\ge 0} k\mu_k = 1,\ \text{there exists } K > 0 \text{ s.t. } \sum_{k\le K} \mu_k = 1\Bigr).$$

Under (H1) the variance $\sigma^2$ of $\mu$ is finite and nonzero. The bounded support condition is quite a strong restriction, but considering nonbounded distributions leads to nontrivial complications, and we were unable to extend the most important results to that case. The condition on the mean can be seen as a normalization, since any distribution $\tilde\mu$ related to $\mu$ by $\tilde\mu_k = \mu_k a^k / \sum_{i} \mu_i a^i$ for some $a > 0$ induces the same distribution as $\mu$ on GW-trees conditioned by the size.
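The normalization by the mean can be illustrated numerically. The sketch below assumes that the tilt has the form $\tilde\mu_k = \mu_k a^k / \sum_i \mu_i a^i$; the helper names `tilt`, `mean` and `critical_tilt` are illustrative and are not taken from the paper. A tree with $n+1$ nodes has $n$ edges, so its weight under the tilted distribution is its weight under $\mu$ times a factor depending only on $n$; the code only searches, by bisection, for the value of $a$ that makes the tilted distribution critical.

```python
def tilt(mu, a):
    """Tilted offspring distribution, with weights proportional to mu[k] * a**k."""
    weights = [m * a ** k for k, m in enumerate(mu)]
    total = sum(weights)
    return [w / total for w in weights]


def mean(mu):
    """Mean number of children under mu (mu[k] is the probability of k children)."""
    return sum(k * m for k, m in enumerate(mu))


def critical_tilt(mu, lo=1e-6, hi=1e6, iterations=200):
    """Bisection on a logarithmic scale for the tilt with mean 1.

    Assumes mu[0] > 0 and mu[k] > 0 for some k >= 2, so that the mean of the
    tilt (nondecreasing in a) crosses 1 inside the bracket [lo, hi].
    """
    for _ in range(iterations):
        mid = (lo * hi) ** 0.5
        if mean(tilt(mu, mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return tilt(mu, (lo * hi) ** 0.5)


# A supercritical example (mean 1.25) brought back to the critical case.
mu = [0.25, 0.25, 0.5]
print(critical_tilt(mu), mean(critical_tilt(mu)))
```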

Let $Y^{(k)} = (Y_{k,1}, \dots, Y_{k,k})$ be $\nu_k$-distributed. We denote by $\nu_{k,j}$, $m_{k,j}$ and $\sigma^2_{k,j}$ the distribution, the mean and the variance of $Y_{k,j}$. We call global mean and global variance of the branching random walk

$$m := \sum_{k\ge 1}\sum_{j=1}^{k} \mu_k m_{k,j} \quad\text{and}\quad \beta^2 := \sum_{k\ge 1}\sum_{j=1}^{k} \mu_k\,\mathbb{E}(Y_{k,j}^2) = \sum_{k\ge 1}\sum_{j=1}^{k} \mu_k\bigl(\sigma^2_{k,j} + m^2_{k,j}\bigr).$$

Let (H2) denote the conditions that the global mean is null, the global variance finite and positive, and for a $p > 4$, the centered $p$th moment of the $Y_{k,j}$'s is finite:

$$(\mathrm{H}_2) := \Bigl(m = 0 \text{ and } \beta \in (0,+\infty),\ \text{there exists } p > 4 \text{ s.t. for any } (k,j),\ 1 \le j \le k \le K,\ \mathbb{E}\bigl(|Y_{k,j} - m_{k,j}|^p\bigr) < +\infty\Bigr).$$



Encoding of branching random walks. We denote by $\preceq$ the lexicographical order (LO) on the planar trees (and $u \prec v$ if $u \preceq v$ and $u \ne v$), and let $u(k)$ be the $k$th vertex in the LO [$u(0) = \varnothing$].

We study the asymptotic behavior of branching random walks via their encoding by depth-first traversal. The depth-first traversal of a tree $T \in \mathcal{T}_n$ is a function

$$F_T : \{0, \dots, 2n\} \to \{\text{vertices of } T\},$$

which we regard as a walk around $T$, as follows: $F_T(0) = \varnothing$, and given $F_T(i) = z$, choose if possible and according to the LO, the smallest child $w$ of $z$ which has not already been visited, and set $F_T(i + 1) = w$. If not possible, let $F_T(i + 1)$ be the father of $z$.

Fig. 2. A branching random walk from Bg. On the first column, the contour process and the contour label process, on the second column, the height process and the height label process.
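The definition of the depth-first traversal translates directly into a short procedure. The following sketch assumes that a planar tree is stored as a Python set of tuples of positive integers (the words of the alphabet, with the empty tuple for the root); this representation and the function names are illustrative choices, not the paper's. It also uses the fact that Python compares tuples lexicographically, a prefix being smaller than its extensions, which matches the LO.

```python
def depth_first_traversal(tree):
    """Return the list [F_T(0), ..., F_T(2n)] for a planar tree with n edges."""
    n_edges = len(tree) - 1
    walk = [()]                          # F_T(0) is the root (empty word)
    next_child = {u: 1 for u in tree}    # smallest child of u not yet visited
    for _ in range(2 * n_edges):
        z = walk[-1]
        w = z + (next_child[z],)
        if w in tree:                    # an unvisited child exists: go down
            next_child[z] += 1
            walk.append(w)
        else:                            # otherwise go back to the father of z
            walk.append(z[:-1])
    return walk


def height_and_contour(tree):
    """Depths of the nodes in lexicographical order, and depths along the traversal."""
    heights = [len(u) for u in sorted(tree)]
    contour = [len(u) for u in depth_first_traversal(tree)]
    return heights, contour


# Root with two children; the first child has two children itself.
tree = {(), (1,), (2,), (1, 1), (1, 2)}
print(depth_first_traversal(tree))
print(height_and_contour(tree))
```

The two sequences returned by `height_and_contour` are the values at integer times of the height process and of the contour process of the tree.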

We now encode the branching random walk with the help of a pair of processes. For any $k \in [\![0, |T| - 1]\!]$, let $H^T_k = |u(k)|$ and $R^T_k = \ell(u(k))$. The height process $(H^T_s, s \in [0, |T| - 1])$ and label process $(R^T_s, s \in [0, |T| - 1])$ are obtained from the sequences $(H^T_k)$ and $(R^T_k)$ by linear interpolation. Alternatively, one may encode the branching random walk with a pair of processes associated with the depth-first traversal: for any $k \in [\![0, 2(|T| - 1)]\!]$, let $\hat H^T_k = |F_T(k)|$ and $\hat R^T_k = \ell(F_T(k))$. The processes $(\hat H^T_s, s \in [0, 2(|T| - 1)])$ and $(\hat R^T_s, s \in [0, 2(|T| - 1)])$, obtained by interpolation, are called respectively the contour process and the contour label process; the pair $(\hat H^T, \hat R^T)$ is called the head of the discrete snake. See some illustrations on Figures 1 and 2.

Let $d := \gcd\{k, k \ge 1, \mu_k > 0\}$. The support of the distribution of $|T|$, written $\mathrm{supp}(|T|)$, is included in $1 + d\mathbb{N}$ [and $\mathbb{P}(|T| = 1 + kd) > 0$ for every $k$ large enough]. For $n + 1 \in \mathrm{supp}(|T|)$, the distribution $\mathbb{P}$ under the conditioning by $|T| = n + 1$ is denoted by $\mathbb{P}_n$, in other words

$$\mathbb{P}_n = \mathbb{P}(\,\cdot \mid |T| = n + 1).$$

Even if not recalled, each statement concerning weak convergence under $\mathbb{P}_n$ is assumed to be along the subsequence $(n_k)_k$ for which $\mathbb{P}_{n_k}$ is well defined. In the proofs we will treat only the case $d = 1$, the general case being treated with slight modifications.

We define $\mathbf{h}_n$, $\hat{\mathbf{h}}_n$, $\mathbf{r}_n$ and $\hat{\mathbf{r}}_n$ to be the processes $H^T$, $\hat H^T$, $R^T$ and $\hat R^T$ under $\mathbb{P}_n$, rescaled as follows:

$$\mathbf{h}_n(s) = \frac{H^T_{ns}}{n^{1/2}}, \quad \hat{\mathbf{h}}_n(s) = \frac{\hat H^T_{2ns}}{n^{1/2}}, \quad \mathbf{r}_n(s) = \frac{R^T_{ns}}{n^{1/4}}, \quad \hat{\mathbf{r}}_n(s) = \frac{\hat R^T_{2ns}}{n^{1/4}} \quad \text{for any } s \in [0,1].$$

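Theorem 1. If (H1) and (H2) are satisfied, then

$$(\mathbf{h}_n, \hat{\mathbf{h}}_n, \mathbf{r}_n, \hat{\mathbf{r}}_n) \xrightarrow[n]{(d)} (\mathbf{h}, \mathbf{h}, \beta\mathbf{r}, \beta\mathbf{r})$$

in $C([0,1])^4$ endowed with the topology of uniform convergence, where $\mathbf{h} = 2\mathbf{e}/\sigma$ and $\mathbf{e}$ is the normalized Brownian excursion, and where, conditionally on $\mathbf{h}$, $\mathbf{r}$ is a centered Gaussian process with covariance function

$$\mathrm{cov}(\mathbf{r}(s), \mathbf{r}(t)) = \check{\mathbf{h}}(s,t) := \min_{u \in [s\wedge t,\, s\vee t]} \mathbf{h}(u) \quad \text{for any } s,t \in [0,1].$$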

Notice that the same processes $\mathbf{h}$ and $\mathbf{r}$ appear twice in the limit process. The convergence of processes associated with the contour processes (those written with a hat) to the same limit as the one associated with the height processes is now well understood, and "almost" generic (Duquesne and Le Gall [10], Section 2.5, and [21]); we therefore concentrate only on the height process. The process $(\mathbf{r},\mathbf{h})$ (or the same process with a different scaling) will be called the head of the Brownian snake with lifetime process the normalized Brownian excursion (BSBE). We refer to the works of Le Gall (e.g., [16] and with Duquesne [10]) for information on the Brownian snake and to the papers cited below for discrete approaches to this object.

In the present work we deal only with the head of the snakes; this is, in principle, different from snakes even if, thanks to the homeomorphism theorem [20] evoked below, Theorem 1 has some direct interpretation in terms of snakes. We refer to [13, 20] for the notion of discrete snake, which is the discrete analogue of BSBE: the discrete snake associated with the branching random walk $(T,\ell)$ is the pair $(\hat H^T, \Phi)$ where $\Phi = (\Phi_k)_{k\in[\![0,2(|T|-1)]\!]}$ and $\Phi_k$ is the sequence of labels on the branch $[\![\varnothing, F_T(k)]\!]$.

Related works. The convergence $\mathbf{h}_n \xrightarrow{(d)} \mathbf{h}$ is due to Aldous [1, 2] (see also Marckert and Mokkadem [21] for a revisited proof, Pitman [25], Chapters 5 and 6, and Duquesne [9] and Duquesne and Le Gall [10], Section 2.5, for generalization to GW trees with offspring distribution having infinite variance).

The first two results concerning the convergence of discrete snakes to the BSBE appeared in two independent works:

• Chassaing and Schaeffer [7] deal with discrete snakes built on underlying trees chosen uniformly in $\mathcal{T}_n$ [this corresponds to the case $\mu \sim \mathrm{Geom}(1/2)$] and where the displacements are i.i.d., and for any $k,j$, $\nu_{k,j}$ is the uniform distribution in $\{-1,0,+1\}$. They show the convergence of the head of the discrete snake for the Skorohod topology, and the convergence of the moments of the maximum of $\mathbf{r}_n$ is also given. This study was motivated by the deep relation between this model of discrete snake and random rooted quadrangulations, underlined by the authors.

• Marckert and Mokkadem [20] also studied the case $\mu \sim \mathrm{Geom}(1/2)$, but with more general centered displacements that have moments of order $6 + \varepsilon$ (the distribution $\nu_{k,j}$ does not depend on $k,j$, but $\nu_k$ is not assumed to be $\nu_{k,1} \times \cdots \times \nu_{k,k}$). The convergence of the head of the snake holds in $(C[0,1], \mathbb{R}^2)$ and the convergence of the snake itself is given thanks to a "homeomorphism theorem" which implies that the convergence of the snake and of its tour (in space of continuous functions) are equivalent. Here it implies that, under the hypothesis of Theorem 1, the discrete snake associated with our model of labeled trees converges weakly to the BSBE.

Then some generalizations appeared a few months later:

• Gittenberger [11] provides a generalization of a lemma from [20] allowing him to consider snakes with underlying GW trees conditioned by the size (condition equivalent to (H1)). The displacements must be centered and have moments of order $8 + \varepsilon$.

• Janson and Marckert [13] show that, in the i.i.d. case [the $\nu_{k,j}$ do not depend on $(k,j)$], moments of order $4 + \varepsilon$ are necessary and sufficient to get the convergence to the BSBE. If no such moment exists, the convergence to a "hairy snake" is proved under the Hausdorff topology.

• In Marckert and Miermont [19], the case of $\nu_{k,j}$ depending on $k,j$ is investigated (also the underlying GW trees are allowed to have two types). The hypotheses are: for each $k,j$, $m_{k,j} = 0$, condition (H2) is satisfied, and $\sum_{k,j} \mu_k \sigma^2_{k,j} < +\infty$. A motivation was to generalize the works of Chassaing and Schaeffer [7] concerning quadrangulations to bipartite maps.

Another important point is the convergence of the occupation measure of the head of the discrete snake to the one of the BSBE, the random measure named ISE (the integrated superBrownian excursion introduced by Aldous [3]; see also Le Gall [16] and [13, 20]). Using the convergence of the discrete snake to the BSBE, Bousquet-Mélou [4] and Bousquet-Mélou and Janson [5] deduce new results on the ISE and on the BSBE; for example, some properties of the support of the ISE, and of the random density of the ISE, are derived. We refer also to Le Gall [15] for the convergence of the discrete snake conditioned to stay positive.

The novelty in the present paper is that the condition $\{m_{k,j} = 0, \forall k,j\}$ is replaced by $m = \sum_{k\ge 1}\sum_{j=1}^{k} \mu_k m_{k,j} = 0$. This allows one to consider some natural models where, for example, the displacements are not random knowing the underlying tree (see Section 1.3). The proof of Theorem 1 relies in part on some results from [19], and on a new approach necessary to control the contribution of the mean of the displacements; the main point for this is the comparison of the lineage of each node with some multinomial r.v. This is the aim of Theorem 2, which we think is interesting in itself, since it reveals a fine global behavior of GW trees conditioned by the size. Unfortunately, the price of this generalization is to consider only offspring distributions with bounded support. The reason comes from the proof of Theorem 2. We guess that some generalization for all families of GW trees (with finite variance) may be found, but this would require controlling an infinite sequence of processes arising in Theorem 2, which we were unable to do.

1.2. On the lineage of nodes. Assume that (H1) and (H2) hold, and let $K$ be a bound on the support of the offspring distribution. We work again conditionally on $T$. For any node $u = i_1 \dots i_h \in T$, let $u_j = i_1 \dots i_j$ and $[\![\varnothing, u]\!] = \{\varnothing = u_0, u_1, \dots, u_h\}$ be the ancestral line of $u$ back to the root. Conditionally on $T$, $\ell(u)$ owns the following representation:

$$\ell(u) = \sum_{m=1}^{|u|} \bigl(\ell(u_m) - \ell(u_{m-1})\bigr), \tag{1}$$

where $\ell(u_m) - \ell(u_{m-1})$ is $\nu_{k,j}$-distributed when $c_{u_{m-1}}(T) = k$ and $i_m = j$, and where the r.v. $(\ell(u_m) - \ell(u_{m-1}))$'s are independent (conditionally on $T$); the variables $\ell(u_m) - \ell(u_{m-1})$ will often be called displacements.

Consider the array

$$I_K = \{(k,j),\ 1 \le j \le k \le K\}.$$

Let $T \in \mathcal{T}$ and $u$ be a node of $T$. For any $(k,j) \in I_K$, let $A_{u,k,j}(T)$ be the number of strict ancestors $v$ of $u$ (the nodes $v \in [\![\varnothing, u[\![$) such that $c_v(T) = k$, and such that $u$ is a descendant of $vj$, the $j$th child of $v$ [we write $f_v(u) = j$]. We say that $v$ is an ancestor of type $(k,j)$ of $u$, and we call the vector $A_u = (A_{u,i})_{i\in I_K}$ the lineage of $u$ (or the content of $[\![\varnothing,u]\!]$). See Figure 3.
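The lineage of a node can be read off its word directly. The sketch below assumes that a planar tree is represented as a set of tuples of positive integers, the empty tuple being the root; the representation and the name `lineage` are illustrative assumptions.

```python
from collections import Counter


def lineage(tree, u):
    """Lineage of u: maps (k, j) to the number of strict ancestors v of u
    having k children and such that u is a descendant of the j-th child of v."""
    counts = Counter()
    for depth in range(len(u)):
        v = u[:depth]                  # the strict ancestor of u at this depth
        j = u[depth]                   # u descends from the child vj
        k = 0
        while v + (k + 1,) in tree:    # count the children of v
            k += 1
        counts[(k, j)] += 1
    return counts


tree = {(), (1,), (2,), (2, 1), (2, 2), (2, 3), (2, 2, 1)}
print(lineage(tree, (2, 2, 1)))
```

In this example the node `(2, 2, 1)` has three strict ancestors, of types (2,2), (3,2) and (1,1), and the entries of the lineage sum to the depth of the node.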

By (1), conditionally on $T$, the label $\ell(u)$ owns the following representation:

$$\ell(u) \stackrel{(d)}{=} \sum_{(k,j)\in I_K} \sum_{l=1}^{A_{u,k,j}(T)} Y^{(l)}_{k,j},$$

where the r.v. $Y^{(l)}_{k,j}$ are independent, and where for any $l$, $Y^{(l)}_{k,j}$ is $\nu_{k,j}$-distributed. In order to make more apparent the contribution of the means $m_{k,j}$'s, and using that $m = 0$, write

$$\ell(u) \stackrel{(d)}{=} \sum_{(k,j)\in I_K} \sum_{l=1}^{A_{u,k,j}(T)} \bigl(Y^{(l)}_{k,j} - m_{k,j}\bigr) + \sum_{(k,j)\in I_K} \bigl(A_{u,k,j}(T) - \mu_k|u|\bigr)\, m_{k,j}. \tag{2}$$

Assume that $T$ is $\mathbb{P}_n$-distributed, and that $u = u(ns)$ for some $s \in (0,1)$. Conditionally on $|u|$, we will see that both parts of the right-hand side of (2) divided by $n^{1/4}$ converge in distribution, and the limit r.v. are asymptotically independent: in the first part, the fluctuations of $A_{u,k,j}$ around $\mu_k|u|$ are not important, whereas they are crucial in the second sum.

We now concentrate on the r.v. $A_u$'s under $\mathbb{P}_n$. For any $l \in [\![0,n]\!]$, $(k,j) \in I_K$, set

$$g^{(n)}_{k,j}(l) := A_{u(l),k,j} - \mu_k |u(l)|.$$

For every $(k,j) \in I_K$, the process $l \mapsto g^{(n)}_{k,j}(l)$ encodes the evolution of the number of ancestors of type $(k,j)$ of $u(l)$, when $l$ varies. Consider $G^{(n)} = (G^{(n)}(s))_{s\in[0,1]}$ the process taking its values in $\mathbb{R}^{\#I_K}$ defined by, for any $s$, $G^{(n)}(s) = (G^{(n)}_{k,j}(s))_{(k,j)\in I_K}$, where $s \mapsto G^{(n)}_{k,j}(s)$ is the real continuous process that interpolates $g^{(n)}_{k,j}$ as follows:

$$G^{(n)}_{k,j}(s) := \frac{g^{(n)}_{k,j}(\lfloor ns \rfloor) + \{ns\}\bigl(g^{(n)}_{k,j}(\lfloor ns + 1 \rfloor) - g^{(n)}_{k,j}(\lfloor ns \rfloor)\bigr)}{n^{1/4}}, \quad s \in [0,1], \tag{3}$$

where $\{x\}$ stands for the fractional part of $x$. The random process $G^{(n)}$ encodes the lineage of all the nodes of $T$; its limiting behavior is described by the following theorem.

Theorem 2. Under (H1) and (H2) the following convergence in distribution holds in $C([0,1])^{\#I_K} \times C[0,1]$ endowed with the topology of the uniform convergence:

$$(G^{(n)}, \mathbf{h}_n) \xrightarrow[n]{(d)} (G, \mathbf{h}),$$

where $\mathbf{h}$ is defined as in Theorem 1, and where conditionally on $\mathbf{h}$, $G = (G_{k,j}(s))_{(k,j)\in I_K,\, s\in[0,1]}$ is a real centered Gaussian field with the following covariance function: for any $(k,j)$ and $(k',j')$ in $I_K$, $s$ and $s'$ in $[0,1]$,

$$\mathrm{cov}\bigl(G_{k,j}(s), G_{k',j'}(s')\bigr) = \bigl(-\mu_k\mu_{k'} + \mu_k \mathbf{1}_{(k,j)=(k',j')}\bigr)\, \check{\mathbf{h}}(s,s'). \tag{4}$$

Fig. 3. Example of a lineage: on this tree four entries of $A_u$ are equal to 1 (among them $A_{u,2,2}$, $A_{u,4,2}$ and $A_{u,5,3}$), the others $A_{u,i}$ are 0.

1.3. Comments, examples and applications. (1) Theorem 2 may be considered as the strongest result of this paper. It gives very precise information on the asymptotic behavior of the process $G^{(n)}$ that encodes the lineage of all the nodes. This gives a "global asymptotic" property reminiscent of the properties of the distinguished branch in "a size biased GW tree" (see [17], Chapter 11).

(2) For any fixed $(k,j) \in I_K$, knowing $\mathbf{h}$, $G_{k,j}$ is a Gaussian process with covariance function

$$\mathrm{cov}\bigl(G_{k,j}(s), G_{k,j}(s')\bigr) = (-\mu_k^2 + \mu_k)\, \check{\mathbf{h}}(s,s').$$

In other words, the process $(G_{k,j}, \mathbf{h})$ has the same distribution as $(\sqrt{-\mu_k^2 + \mu_k}\,\mathbf{r}, \mathbf{h})$, and then up to some multiplicative constants, $(G_{k,j}, \mathbf{h})$ is the head of a BSBE. As a simple consequence of Theorem 2, we have that $(G_{k,j}, \mathbf{h})_{(k,j)\in I_K}$ is a sequence of heads of BSBE, and that for any $(k,j) \in I_K$,

$$(G^{(n)}_{k,j}, \mathbf{h}_n) \xrightarrow[n]{(d)} (G_{k,j}, \mathbf{h}). \tag{5}$$

The dependence between the different processes $G_{k,j}$ is governed by (4). For any family of real numbers $(\lambda_{k,j})_{(k,j)\in I_K}$, we have

$$\Bigl(\sum_{(k,j)\in I_K} \lambda_{k,j}\, G^{(n)}_{k,j},\ \mathbf{h}_n\Bigr) \xrightarrow[n]{(d)} \Bigl(\sum_{(k,j)\in I_K} \lambda_{k,j}\, G_{k,j},\ \mathbf{h}\Bigr). \tag{6}$$

(3) Consider the case $\mu = \frac12(\delta_0 + \delta_2)$, $\nu_2 = \delta_{(+1,-1)}$ of binary trees in which the displacements are not random: $\ell(u1) - \ell(u) = +1$ and $\ell(u2) - \ell(u) = -1$. We have $m = 0$ and $\beta^2 = \frac12(1 + 1) = 1$, and Theorems 1 and 2 apply. Hence, the clear positive bias for $\mathbf{r}_n(t)$ for small values of $t$ disappears at the limit. Note also that this normalizing factor is exactly the same as if $\nu_2 = \frac12(\delta_{(+1,-1)} + \delta_{(-1,+1)})$ [case where $(\ell(u1) - \ell(u), \ell(u2) - \ell(u))$ is equally likely $(+1,-1)$ or $(-1,+1)$] and as if $\nu_2 = (\frac12(\delta_{+1} + \delta_{-1}))^{\otimes 2}$ [case where $\ell(u1) - \ell(u)$ and $\ell(u2) - \ell(u)$ are i.i.d., uniform on $\{-1, 1\}$]. The question of the convergence of the discrete snake in the case $\nu_2 = \delta_{(+1,-1)}$ appears first in Marckert [18] in relation with some properties of the rotation correspondence, and the difference between left and right depth in binary trees. The convergence of $(\mathbf{r}_n)$ is not given in [18], but the convergence of the occupation measure of $\mathbf{r}_n$, "the discrete ISE," to ISE is established. We refer also to Janson [12] for recent developments concerning the same question.

Further, notice that in this model, the label $\ell(u)$ of a vertex $u$ is $\ell(u) = A_{u,2,1} - A_{u,2,2}$, that is, the number of left steps minus the number of right steps necessary to climb from the root to $u$ in the binary tree. The convergence of $(\mathbf{r}_n)$ can be seen directly via the one of $(G^{(n)})$:

$$(G^{(n)}_{2,1}, G^{(n)}_{2,2}, \mathbf{h}_n) \xrightarrow[n]{(d)} (G_{2,1}, G_{2,2}, \mathbf{h}), \tag{7}$$

and then $\mathbf{r}_n = G^{(n)}_{2,1} - G^{(n)}_{2,2} \xrightarrow[n]{(d)} G_{2,1} - G_{2,2}$ which is, conditionally on $\mathbf{h}$ and according to (4), a centered Gaussian process with covariance function $\check{\mathbf{h}}(s,t)$. Here, the convergence of $(\mathbf{r}_n)$ appears to be a consequence of the convergence of $G^{(n)}_{2,1}$ and $G^{(n)}_{2,2}$, encoding the left depth and the right depth in binary trees.
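The covariance structure of the limiting field can be checked on small examples. The sketch below assumes that the coefficient in front of the minimum of the excursion is $-\mu_k\mu_{k'} + \mu_k \mathbf{1}_{(k,j)=(k',j')}$; the function names and the list encoding of $\mu$ are illustrative. It computes the coefficient $c$ such that a linear combination $\sum \lambda_{k,j} G_{k,j}(s)$ has conditional variance $c\,\mathbf{h}(s)$.

```python
def index_set(K):
    """The array I_K = {(k, j) : 1 <= j <= k <= K}, in a fixed order."""
    return [(k, j) for k in range(1, K + 1) for j in range(1, k + 1)]


def covariance_coefficients(mu):
    """Index list and matrix with entries -mu_k mu_k' + mu_k 1{(k,j) = (k',j')}.

    mu is a list, mu[k] being the probability of having k children (k = 0..K).
    """
    idx = index_set(len(mu) - 1)
    matrix = [[-mu[k] * mu[k2] + (mu[k] if (k, j) == (k2, j2) else 0.0)
               for (k2, j2) in idx] for (k, j) in idx]
    return idx, matrix


def variance_coefficient(mu, lam):
    """Quadratic form lam^T C lam, where lam is a dict keyed by (k, j)."""
    idx, matrix = covariance_coefficients(mu)
    return sum(lam[a] * c_ab * lam[b]
               for a, row in zip(idx, matrix) for b, c_ab in zip(idx, row))


# Binary trees with deterministic displacements +1 (left) and -1 (right).
mu = [0.5, 0.0, 0.5]
lam = {(1, 1): 0.0, (2, 1): 1.0, (2, 2): -1.0}
print(variance_coefficient(mu, lam))
```

With these inputs the coefficient equals 1, in agreement with the value of the global variance in the binary example.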

We would like to stress the following point: discrete snakes are usually constructed with "two levels of randomness": the underlying trees are random and so are the displacements given the underlying tree, and then BSBE appears to be a natural limit of these objects. Here, we provide some objects with only "one level of randomness" that converge to the Brownian snake. The BSBE appears as a kind of internal complexity measure in trees, measuring the difference between the number of ancestors of type $(k,j)$ and some expected quantities.

2. Proofs. The proofs rely on a precise study of the lineage of the nodes under $\mathbb{P}_n$ and, in particular, on the comparison of $A_u$ with a multinomial random variable. For this reason, we first give some elements on multinomial distributions and on their asymptotic behaviors. We then proceed to the proof of Theorem 2, showing first the convergence of the uni-dimensional distribution, then the convergence of the finite-dimensional distribution. The proof of Theorem 1 is given afterward. We think that some points of view, especially in the description of the distribution of the lineages in trees under $\mathbb{P}_n$, should provide some new approaches to study the trees under $\mathbb{P}_n$.

2.1. Prerequisite on multinomial distributions. The contents of this section are quite classical. Consider $p = (p_i)_{i\in I_K}$ the distribution on $I_K$ defined by

$$p_{k,j} := \mu_k \quad \text{for any } (k,j) \in I_K.$$

For any $h \ge 1$, let $\mathbb{N}^{I_K}[h]$ be the set of elements $c = (c_i)_{i\in I_K}$ of $\mathbb{N}^{\#I_K}$ such that $\sum_{i\in I_K} c_i = h$. We say that $M^{(h)}$ is a multinomial r.v. with parameters $h$ and $p$ if, for any $m = (m_i)_{i\in I_K} \in \mathbb{N}^{I_K}[h]$,

$$\mathbb{Q}_h(\{m\}) := \mathbb{P}(M^{(h)} = m) = \binom{h}{(m_i)_{i\in I_K}} \prod_{i\in I_K} p_i^{m_i},$$

where $\binom{h}{(m_i)_{i\in I_K}} = h! / \prod_{i\in I_K} m_i!$. Recall that for any $i \in I_K$, $M^{(h)}_i$ is a binomial r.v. with parameters $h$ and $p_i$.
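As a sanity check on the definition, the multinomial weights can be computed exactly for small $h$. The sketch below assumes that the distribution $p$ puts mass $\mu_k$ on each pair $(k,j)$, which sums to 1 precisely because $\mu$ is critical; the names `multinomial_pmf` and `compositions` are illustrative.

```python
from math import factorial, prod


def multinomial_pmf(m, p):
    """Probability that a multinomial vector with parameters h = sum(m) and p equals m.

    m and p are dicts sharing the same keys (the pairs (k, j) of I_K).
    """
    coeff = factorial(sum(m.values()))
    for count in m.values():
        coeff //= factorial(count)       # each division is exact
    return coeff * prod(p[i] ** m[i] for i in m)


def compositions(h, idx):
    """All vectors of nonnegative integers indexed by idx whose entries sum to h."""
    if len(idx) == 1:
        yield {idx[0]: h}
        return
    for first in range(h + 1):
        for rest in compositions(h - first, idx[1:]):
            yield {idx[0]: first, **rest}


mu = [0.2, 0.6, 0.2]                     # critical: 0.6 + 2 * 0.2 = 1
idx = [(k, j) for k in range(1, len(mu)) for j in range(1, k + 1)]
p = {(k, j): mu[k] for (k, j) in idx}
print(sum(multinomial_pmf(m, p) for m in compositions(3, idx)))
```

The printed total is 1 up to rounding, since the weights are summed over all vectors with total $h = 3$.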

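In order to fit with further considerations, we introduce the $\#I_K$-dimensional real vector $\mathcal{G}(n,h) = (\mathcal{G}_i(n,h))_{i\in I_K}$ defined by

$$\mathcal{G}_{k,j}(n,h) = n^{-1/4}\bigl(M^{(h)}_{k,j} - \mu_k h\bigr) \quad \text{for any } (k,j) \in I_K.$$

Let $\mathcal{G}_\infty = (\mathcal{G}_{\infty,i})_{i\in I_K}$ be a centered Gaussian vector having as covariance function

$$\mathrm{cov}(\mathcal{G}_{\infty,i}, \mathcal{G}_{\infty,i'}) = -p_i p_{i'} + p_i \mathbf{1}_{i=i'} \quad \text{for any } i, i' \in I_K. \tag{8}$$

Proposition 3. Let $(h(n))$ be a sequence of positive integers s.t. $h(n)/\sqrt{n} \to \lambda \in (0,+\infty)$. Under (H1), we have $\mathcal{G}(n, h(n)) \xrightarrow[n]{(d)} \sqrt{\lambda}\,\mathcal{G}_\infty$ in $\mathbb{R}^{\#I_K}$.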

Proof. This may be proved using classical tools. As pointed out by E. Rio in a personal discussion, this is also a consequence of the convergence of the empirical process to the Brownian bridge. We only sketch the proof (for $\lambda = 1$): let $(U_i)_i$ be a sequence of i.i.d. r.v. uniform on $[0,1]$. Let $F_n$ be the associated empirical distribution function and $F$ the distribution function of $U_1$. Denote $g_n = F_n - F$.

According to Donsker [8], $\sqrt{n}\, g_n \xrightarrow[n]{(d)} \mathbf{b}$, where $\mathbf{b}$ is a normalized Brownian bridge.

Take $q = (q_i)_{i\ge 1}$ a distribution on $\mathbb{N}^*$ and consider

$$N^{(n)}_k = \#\{j \in \{1, \dots, n\},\ U_j \in [q_1 + \cdots + q_{k-1},\ q_1 + \cdots + q_k)\},$$

with the convention that an empty sum is 0. Then $(N^{(n)}_k)_{k\ge 1}$ is a multinomial r.v. with parameters $n$ and $q$, and satisfies almost surely

$$(N^{(n)}_k - q_k n)/\sqrt{n} = \sqrt{n}\bigl(g_n(q_1 + \cdots + q_k) - g_n(q_1 + \cdots + q_{k-1})\bigr).$$

By Donsker, for any $L > 0$, $((N^{(n)}_k - q_k n)/\sqrt{n})_{k\le L}$ converges in distribution to $(\mathbf{b}(q_1 + \cdots + q_k) - \mathbf{b}(q_1 + \cdots + q_{k-1}))_{k\le L}$. The properties of $\mathbf{b}$ allow one to conclude. □
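The coupling used in this sketch of proof, a multinomial vector obtained by sorting uniform variables into consecutive intervals, is easy to simulate. The code below is an illustration under the assumption that the $k$th interval is $[q_1 + \cdots + q_{k-1}, q_1 + \cdots + q_k)$; the function name and the seed are arbitrary choices.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def multinomial_from_uniforms(uniforms, q):
    """Counts of the uniforms falling in the consecutive intervals of lengths q[0], q[1], ..."""
    cumulative = list(accumulate(q))
    counts = [0] * len(q)
    for u in uniforms:
        # bisect_right gives the number of cumulative sums <= u, i.e. the interval index;
        # the min() guards against a value equal to the last cumulative sum.
        k = min(bisect_right(cumulative, u), len(q) - 1)
        counts[k] += 1
    return counts


random.seed(0)
n = 10_000
q = [0.1, 0.4, 0.5]
counts = multinomial_from_uniforms([random.random() for _ in range(n)], q)
# Centered and rescaled counts, the quantities appearing in the proof sketch.
print([(c - qk * n) / n ** 0.5 for c, qk in zip(counts, q)])
```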

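The following proposition will be used in the proof of the tightness of $(G^{(n)})$.

Proposition 4. Under (H1), for any $\beta \ge 1$, there exists $c > 0$ such that, for any $h > 0$, any $n > 0$,

$$\mathbb{E}\bigl(\|\mathcal{G}(n,h)\|_1^{\beta}\bigr) \le c\,\bigl(h/\sqrt{n}\bigr)^{\beta/2}.$$

Recall that all the norms are equivalent in $\mathbb{R}^{\#I_K}$. Here, we use $\|X\|_1 = \sum_{(k,j)\in I_K} |X_{k,j}|$.

Proof of Proposition 4. First, $\|X\|_1^{\beta} \le c \sum_{(k,j)\in I_K} |X_{k,j}|^{\beta}$ for some constant $c$, so that it suffices to bound each term $\mathbb{E}(|\mathcal{G}_{k,j}(n,h)|^{\beta})$. Since $M^{(h)}_{k,j}$ is a binomial random variable with parameters $\mu_k$ and $h$, $\mathbb{E}(|M^{(h)}_{k,j} - \mu_k h|^{\beta}) \le C(\mu_k,\beta)\, h^{\beta/2}$, where the constant $C(\mu_k,\beta)$ depends on $\mu_k$ and $\beta$ (see Petrov [24], Theorem 2.10, page 62). □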

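2.2. Decomposition of trees using the lineages. A forest is a finite sequence of trees, that is an element of $\mathcal{F} := \bigcup_{k\ge 0} \mathcal{T}^k$. For any $k \in \mathbb{N}$, a forest with $k$ roots is a $k$-tuple of planar trees $f = (t^1, \dots, t^k)$. The size $|f|$ of $f$ is $|t^1| + \cdots + |t^k|$. We denote by $\mathbf{f}_k = (T^1, \dots, T^k)$ a random forest in which the trees $T^1, \dots, T^k$ are i.i.d. GW trees with offspring distribution $\mu$. For any $a = (a_{k,j})_{(k,j)\in I_K} \in \mathbb{N}^{I_K}$, write

$$N_1(a) = \sum_{(k,j)\in I_K} (j - 1)\, a_{k,j} \quad\text{and}\quad N_2(a) = \sum_{(k,j)\in I_K} (k - j)\, a_{k,j}.$$

Proposition 5. Let $h$ be a nonnegative integer. For any $a \in \mathbb{N}^{I_K}[h]$, and any $m \in [\![0,n]\!]$,

$$\mathbb{P}_n(A_{u(m)} = a) = \mathbb{Q}_h(a)\, \frac{\mathbb{P}(|\mathbf{f}_{N_1(a)}| = m - h)\, \mathbb{P}(|\mathbf{f}'_{1+N_2(a)}| = n + 1 - m)}{\mathbb{P}(|T| = n + 1)}, \tag{9}$$

where $\mathbf{f}$ and $\mathbf{f}'$ are two independent forests.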
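The two quantities entering this decomposition count the subtrees hanging on the left and on the right of the ancestral line. The sketch below assumes the formulas $N_1(a) = \sum (j-1)a_{k,j}$ and $N_2(a) = \sum (k-j)a_{k,j}$ and a lineage stored as a dictionary keyed by $(k,j)$; both function names are illustrative.

```python
def n1_n2(a):
    """Numbers of left and right neighbors of the ancestral line for a lineage a.

    a maps (k, j) to the number of ancestors of type (k, j): such an ancestor
    has j - 1 children to the left of the line and k - j to its right.
    """
    n1 = sum((j - 1) * count for (k, j), count in a.items())
    n2 = sum((k - j) * count for (k, j), count in a.items())
    return n1, n2


def mean_left_neighbors(mu):
    """Sum over (k, j) of (j - 1) mu_k, which equals sum_k mu_k k (k - 1) / 2."""
    return sum(m * k * (k - 1) / 2 for k, m in enumerate(mu))


# Lineage of a node at depth 3 with ancestors of types (2,2), (3,2) and (1,1).
print(n1_n2({(2, 2): 1, (3, 2): 1, (1, 1): 1}))
print(mean_left_neighbors([0.2, 0.6, 0.2]))
```

For a critical offspring distribution the second function returns half of the variance, which is the expected number of left (or right) neighbors contributed by one ancestor drawn from $p$.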

Proof. To build a tree $T$ of $\mathcal{T}_n$ such that $A_{u(m)} = a$, we first build the branch $b = [\![\varnothing, u(m)]\!]$: exactly $a_{k,j}$ ancestors $v$ among the $h$ strict ancestors of $u(m)$ satisfy $(c_v(T), f_v(u(m))) = (k,j)$. Hence, there are $\binom{h}{a}$ ways to build $b$. Then, we complete $b$ by grafting on its neighbors some subtrees satisfying the following constraints. When $A_{u(m)} = a$, the numbers of subtrees rooted on the neighbors of the branch $[\![\varnothing, u(m)[\![$ visited before $u(m)$ [resp. after $u(m)$] are respectively

$$N_1(a) = \#\{w,\ d([\![\varnothing, u(m)]\!], w) = 1,\ w \prec u(m)\},$$

$$1 + N_2(a) = \#\bigl(\{u(m)\} \cup \{w,\ d(]\!]u(m), \varnothing]\!], w) = 1,\ u(m) \prec w\}\bigr).$$

See an illustration on Figure 4. The $N_1(a)$ subtrees must contain exactly $m - |u(m)|$ nodes (the nodes, among the $m + 1$ first, not on $[\![\varnothing, u(m)]\!]$), and the $1 + N_2(a)$ subtrees must contain exactly $n + 1 - m$ nodes [the nodes visited after $u(m)$, $u(m)$ included]. In other words, we need two forests containing respectively $m - h$ and $n + 1 - m$ nodes. Hence, using simple considerations on the probability distribution of GW trees, we get the announced result. □

Fig. 4. The two forests considered in the decomposition.



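A consequence of this proposition is

$$\mathbb{P}_n(|u(m)| = h) = \sum_{a \in \mathbb{N}^{I_K}[h]} \mathbb{Q}_h(a)\, \frac{\mathbb{P}(|\mathbf{f}_{N_1(a)}| = m - h)\, \mathbb{P}(|\mathbf{f}'_{1+N_2(a)}| = n + 1 - m)}{\mathbb{P}(|T| = n + 1)} = \frac{\mathbb{P}\bigl(|\mathbf{f}_{N_1(M^{(h)})}| = m - h,\ |\mathbf{f}'_{1+N_2(M^{(h)})}| = n + 1 - m\bigr)}{\mathbb{P}(|T| = n + 1)}, \tag{10}$$

where $M^{(h)}$ is a multinomial random variable with parameters $h$ and $p$.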



2.2.1. Few facts concerning random forests and random trees. Let $(W_j)_{j\ge 0}$ be a random walk starting from 0 with i.i.d. increments with distribution $(\Lambda_k)_{k\ge -1} = (\mu_{k+1})_{k\ge -1}$ (i.e., with increment $\xi - 1$, where $\xi$ is $\mu$-distributed). We have the following:

Lemma 6. Assume (H1) holds true.

(i) (Otter [23]) For any $k \ge 1$ and $n \ge k$,

$$\mathbb{P}(|\mathbf{f}_k| = n) = \frac{k}{n}\, \mathbb{P}(W_n = -k). \tag{11}$$

(ii) [Central local limit theorem (CLLT)]

$$\sup_{l \in -n + d\mathbb{N}} \Bigl| \frac{\sqrt{n}}{d}\, \mathbb{P}(W_n = l) - \frac{1}{\sqrt{2\pi}\,\sigma} \exp\Bigl(-\frac{l^2}{2\sigma^2 n}\Bigr) \Bigr| \xrightarrow[n]{} 0. \tag{12}$$

(iii) $\sup_{n>0} \sup_{x>0}\, x\, \mathbb{P}(W_n = x) < +\infty$.

(i) is often called "conjugation of tree principle" or "cyclical lemma," may be found in Pitman [25], Chapter 5.1, and is usually attributed to Otter, Kemperman or Dvoretzky-Motzkin.

(ii) is usually called the central local limit theorem (see Breuillard [6] for a state of the art). Recall that $d$ is the span of $\mu$. The support of $W_n$ is included in $-n + d\mathbb{N} = \{u \in \mathbb{Z},\ u = -n + di,\ i \in \mathbb{N}\}$. A consequence of (i) and (ii) is that

$$\mathbb{P}(|T| = n) \sim \frac{d}{\sigma\sqrt{2\pi}}\, n^{-3/2}, \tag{13}$$

the equivalent being taken along the subsequence where the left-hand side is nonnull.

Proof of Lemma 6. (iii) $\sup_{n>0} \sup_{x\ge\sqrt{n}} \{x\, \mathbb{P}(W_n = x)\}$ is bounded by the Chebyshev inequality. By (ii), $\sup_{x\le\sqrt{n}} \sqrt{n}\, \mathbb{P}(W_n = x) \to \frac{d}{\sigma\sqrt{2\pi}}$, then $\sup_{n>0} \sup_{x\le\sqrt{n}} \sqrt{n}\, \mathbb{P}(W_n = x)$ is finite. □

The following lemma controls the maximum increment in the process $H^T$ under $\mathbb{P}_n$.

Lemma 7. Assume (H1). For any $c > 0$, there exists $\rho > 0$ such that

$$\mathbb{P}_n\Bigl(\max_l \bigl\{\bigl||u(l + 1)| - |u(l)|\bigr|\bigr\} \ge \rho \log n\Bigr) = O(n^{-c}).$$

Proof. The proof relies heavily on the conjugation of tree principle. Take $n + 1$ i.i.d. r.v. $X_1, \dots, X_{n+1}$, $\mu$-distributed. Conditionally on $\sum_{i=1}^{n+1} (X_i - 1) = -1$, among the $n + 1$ shifted sequences $(X_1, \dots, X_{n+1})$, $(X_2, \dots, X_{n+1}, X_1)$, ..., $(X_{n+1}, X_1, \dots, X_n)$, exactly one, $(X^\star_1, \dots, X^\star_{n+1})$, corresponds to a sequence $(c_u, u \in T)$ for a tree $T \in \mathcal{T}_n$ (where the $c_u$ are sorted according to the depth-first order), and $(X^\star_1, \dots, X^\star_{n+1}) \stackrel{(d)}{=} (c_u, u \in T)$ for $T$ under $\mathbb{P}_n$.
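The conjugation principle can be checked by brute force on a short sequence. The sketch below assumes the classical characterization of the depth-first sequence of child counts of a tree: the partial sums of the terms minus one stay nonnegative until the last step, where they reach $-1$. The function names are illustrative.

```python
def is_tree_sequence(x):
    """True if x is the depth-first list of child counts of a planar tree."""
    total = 0
    for i, xi in enumerate(x):
        total += xi - 1
        if total < 0:
            # The walk may go below 0 only at the very last step.
            return i == len(x) - 1 and total == -1
    return False


def tree_rotations(x):
    """Cyclic shifts of x that encode a tree (a single one when sum(x) == len(x) - 1)."""
    shifts = (x[s:] + x[:s] for s in range(len(x)))
    return [y for y in shifts if is_tree_sequence(y)]


x = [0, 2, 0, 0, 2]          # sum(x) = 4 = len(x) - 1
print(tree_rotations(x))
```

Among the five shifts of this sequence only `[2, 0, 2, 0, 0]` encodes a tree, namely a root with two children whose second child has two leaves as children.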

The inequality $\bigl||u(l)| - |u(l + 1)|\bigr| = h > 1$ implies that $|u(l + 1)| < |u(l)|$, and the deepest common ancestor $v$ of $u(l + 1)$ and $u(l)$ has depth $|u(l + 1)| - 1$. Assume that the tree is visited according to the reversed LO (the order on the alphabet $\mathbb{N}^*$ is reversed, but if $z$ is a prefix of $z'$, $z$ is still smaller than $z'$: this amounts to walking around the tree counterclockwise and ranking the nodes according to their first visit time). In the reversed LO, the nodes in $]\!]v, u(l)[\![$ are visited consecutively, and each of them has at least one child. Under $\mathbb{P}$, when traversing the tree in the LO (or by symmetry in the reversed LO), the gap between two nodes having zero child is a geometrical r.v. $\mathrm{Geom}(\mu_0)$. We work now on the LO order. Denote by $X_1, \dots, X_{n+1}$ i.i.d. random variables $\mu$-distributed and by $G_1, G_2, \dots$ the successive gaps between the zeros:

$$\mathbb{P}\Bigl(\max_i G_i \ge \rho \log n \Bigm| \sum_{i=1}^{n+1} (X_i - 1) = -1\Bigr) = O\Bigl(n^{1/2}\, \mathbb{P}\bigl(\max_i G_i \ge \rho \log n\bigr)\Bigr) = O(n^{-c})$$

for $\rho$ large enough. Note that the first maximum is taken on a random number of terms, a.s. bounded by $n$. By the conjugation of tree principle, we get the result. □

Remark 1. Using the same argument, one may control the depth of the last node $u(n)$: for any $c > 0$, there exists $\rho > 0$ such that

$$\mathbb{P}_n(|u(n)| \ge \rho \log n) = O(n^{-c}). \tag{14}$$

For $u \in T$, $l \in [\![0, |u|]\!]$ and $(k,j) \in I_K$, let $A_{u,l,k,j}$ be the number of ancestors $v \in [\![\varnothing, u[\![$ such that $d(u,v) \le l$, and for which $c_v(T) = k$ and $f_v(u) = j$.

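Lemma 8. (i) For every $c > 0$, there exists $\gamma > 0$ such that, for $n$ large enough,

$$\mathbb{P}_n\bigl(\exists (k,j) \in I_K,\ u \in T,\ |A_{u,k,j} - \mu_k|u|| \ge \gamma\sqrt{|u| \log n}\bigr) \le n^{-c}.$$

(ii) For every $c > 0$, there exists $\gamma > 0$ such that, for $n$ large enough,

$$\mathbb{P}_n\bigl(\exists (k,j) \in I_K,\ u \in T,\ l \in [\![0, |u|]\!],\ |A_{u,l,k,j} - \mu_k l| \ge \gamma\sqrt{l \log n}\bigr) \le n^{-c}.$$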

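Proof. (ii) clearly implies (i). But let us prove (i) first. Using (9) and (13), we have for some constant $c > 0$, for any $m \in [\![0,n]\!]$, any $h \ge 1$, any $a \in \mathbb{N}^{I_K}[h]$,

$$\mathbb{P}_n(A_{u(m)} = a) \le c\, n^{3/2}\, \mathbb{Q}_h(a)\, \mathbf{1}_{h\le n}. \tag{15}$$

Then

$$\mathbb{P}_n\bigl(\exists m \in [\![0,n]\!],\ (k,j) \in I_K,\ |A_{u(m),k,j} - \mu_k|u(m)|| \ge \gamma\sqrt{|u(m)| \log n}\bigr) \le c\, n^{3/2} \sum_{m=0}^{n} \sum_{h=0}^{n} \mathbb{P}\bigl(\exists (k,j) \in I_K,\ |M^{(h)}_{k,j} - \mu_k h| \ge \gamma\sqrt{h \log n}\bigr).$$

This latter probability is smaller, for any $m \le n$, $h \le n$, than $2\,\#I_K\, n^{-2\gamma^2}$ by Hoeffding. Hence, $\mathbb{P}_n(\exists (k,j) \in I_K,\ u \in T,\ |A_{u,k,j} - \mu_k|u|| \ge \gamma\sqrt{|u| \log n}) \le c\, n^{7/2}\, n^{-2\gamma^2}$.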

For (ii), assume that $|u(m)| = h$ and for $l \le h$, take $v_1,\dots,v_l$ the ancestors of $u(m)$ at depth $0 \le h_1 < \dots < h_l < h$, and set $A^{v_1,\dots,v_l}_{k,j} = \#\{i,\ c_{v_i} = k,\ f_{v_i}(u(m)) = j\}$, the lineage of $u(m)$ restricted to the nodes $v_i$'s. By "symmetry," $(A^{v_1,\dots,v_l}_{k,j})_{k,j}$ and $(A_{u(m),l,k,j})_{k,j}$ have the same distributions. Here "symmetry" means the following: let $v_1$ and $v_2$ be two ancestors of $u(m)$. Exchange in $T$ the two nodes $v_1$ and $v_2$ together with the subtrees rooted on their children not on $[\![\varnothing,u(m)]\!]$, as on Figure 5. We get $T'$. First, $T'$ and $T$ have the same weight under $\mathbb{P}_n$. Second, $A_{u(m)}$ has the same value in $T$ and $T'$, and the nodes $u(m)$ in $T$ and $T'$ have the same depth [$u(m)$ is by definition the $m$th node].

Fig. 5. Exchange of two nodes in a lineage.

Now take $v$ the ancestor of $u(m)$ at depth $l$. By symmetry, $(A_{v,k,j})_{k,j}$ and $(A_{u(m),l,k,j})_{k,j}$ have the same distributions. And thus, by (i), for any $m \le n$, $l \le n$, $\mathbb{P}_n\big(\exists (k,j) \in I_K,\ |A_{u(m),l,k,j} - \mu_k l| > \gamma\sqrt{l\log n}\big)$ is certainly smaller than $c\,n^{7/2}\,n^{-2\gamma^2}$. As a direct consequence, $c\,n^{7/2+2}\,n^{-2\gamma^2}$ is a bound for $\mathbb{P}_n\big(\exists (k,j) \in I_K, u \in T, l \in [\![0,|u|]\!],\ |A_{u,l,k,j} - \mu_k l| > \gamma\sqrt{l\log n}\big)$. □
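The lineage counted in Lemma 8 can be read directly off the word encoding of a node. The following is a minimal sketch, assuming a planar tree stored as a set of tuples in Neveu's formalism (the root is the empty tuple); the helper names `num_children` and `lineage` are hypothetical and not part of the original text.

```python
from collections import Counter

def num_children(tree, v):
    """c_v(T): number of children of v in a planar tree given as a set of tuples."""
    k = 0
    while v + (k + 1,) in tree:
        k += 1
    return k

def lineage(tree, u, l=None):
    """Return a Counter {(k, j): A_{u,l,k,j}} over the strict ancestors v of u
    with d(u, v) <= l (all strict ancestors when l is None)."""
    if l is None:
        l = len(u)
    counts = Counter()
    for depth in range(max(0, len(u) - l), len(u)):
        v = u[:depth]                 # ancestor of u at this depth
        k = num_children(tree, v)     # c_v(T)
        j = u[depth]                  # f_v(u): rank of the child of v that leads to u
        counts[(k, j)] += 1
    return counts

# Assumed toy tree: the root has 2 children, node (2,) has 3 children, the rest are leaves.
tree = {(), (1,), (2,), (2, 1), (2, 2), (2, 3)}
full = lineage(tree, (2, 3))          # whole lineage of the node 23
last = lineage(tree, (2, 3), l=1)     # restricted to the ancestor at distance 1
```

By construction the entries of `lineage(tree, u, l)` sum to $l$, which is the quantity compared with $\mu_k l$ in Lemma 8(ii).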



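We end this section with a result concerning multinomial random variables. For any $h \ge 0$, set

$J_h = \big\{a \in \mathbb{N}^{I_K}[h],\ (N_1(a),N_2(a)) \in \big[\tfrac{\sigma_\mu^2}{2}h - h^{2/3},\ \tfrac{\sigma_\mu^2}{2}h + h^{2/3}\big]^2\big\}$.

Lemma 9. For any $h \in \mathbb{N}$, $N_1(M^{(h)})$ and $N_2(M^{(h)})$ have the same law and there exist $c_1 > 0$, $c_2 > 0$ such that

$\mathbb{P}(M^{(h)} \notin J_h) \le c_1\exp(-c_2h^{1/3})$.

Proof. The first assertion is easy. Writing $\{|N_1(M^{(h)}) - \tfrac{\sigma_\mu^2}{2}h| \ge h^{2/3}\} \subset \bigcup_{k,j}\{|M^{(h)}_{k,j} - h\mu_k| \ge h^{2/3}/(K\,\#I_K)\}$ [check that $\sum_{k,j}(j-1)\mu_k = \sum_{k,j}(k-j)\mu_k = \sigma_\mu^2/2$], by Hoeffding, one has $\mathbb{P}(|M^{(h)}_{k,j} - h\mu_k| \ge h^{2/3}/(K\,\#I_K)) \le 2\exp(-2h^{1/3}/(K\,\#I_K)^2)$. Summing this for $(k,j) \in I_K$, the result is shown to be true. □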
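The identity used in the proof of Lemma 9 can be checked on a small example. The sketch below assumes that $N_1(a) = \sum (j-1)a_{k,j}$ and $N_2(a) = \sum (k-j)a_{k,j}$ (numbers of left and right brothers along the lineage, as suggested by the bracketed check in the proof) and uses an arbitrary critical offspring law; the function names and the law `mu` are illustrative assumptions.

```python
from fractions import Fraction

def index_set(K):
    """I_K = {(k, j): 1 <= j <= k <= K}."""
    return [(k, j) for k in range(1, K + 1) for j in range(1, k + 1)]

def mean_N1_N2(mu):
    """Per-ancestor expectations of N_1 and N_2 when the symbol (k, j)
    has probability mu[k]; mu is a dict k -> probability."""
    K = max(mu)
    n1 = sum((j - 1) * mu.get(k, 0) for k, j in index_set(K))
    n2 = sum((k - j) * mu.get(k, 0) for k, j in index_set(K))
    return n1, n2

def variance(mu):
    """Variance of the offspring law mu."""
    mean = sum(k * p for k, p in mu.items())
    return sum(k * k * p for k, p in mu.items()) - mean ** 2

# Assumed example of a critical law (mean 1) supported on {0, 1, 3}.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 3: Fraction(1, 4)}
n1, n2 = mean_N1_N2(mu)
```

Exact rational arithmetic is used so that the equality with $\sigma_\mu^2/2$ can be tested without rounding.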
2.2.2. A first comparison lemma. In this section $S$ denotes a Polish space. For any r.v. $X$ taking its values in $S$, we denote by $\mathbb{P}_X$ the distribution of $X$: that is, $\mathbb{P}_X(A) = \mathbb{P}(X \in A)$ for any Borelian $A$ of $S$.

Definition 1. Letting $(Y_1,Y_2,\dots)$ and $(X_1,X_2,\dots)$ be two sequences of r.v. taking their values in $S$ such that $\mathbb{P}_{X_n}$ is absolutely continuous with respect to $\mathbb{P}_{Y_n}$, we write $\mathbb{P}_{X_n} \prec \mathbb{P}_{Y_n}$. Let $f_n$ be a nonnegative measurable function such that $\mathbb{P}_{X_n} = f_n\mathbb{P}_{Y_n}$ [in other words, $\mathbb{P}_{X_n}(A) = \int_A f_n\,d\mathbb{P}_{Y_n}$ for any Borelian $A$ of $S$]; the existence of $f_n$ is ensured by the Radon-Nikodym theorem. For any $\varepsilon > 0$, let $A^n_\varepsilon := \{x,\ |f_n(x) - 1| \le \varepsilon\}$. We say that $\mathbb{P}_{X_n}/\mathbb{P}_{Y_n} \to 1$, or $X_n /\!/ Y_n \to 1$, if for any $\varepsilon > 0$, $\mathbb{P}_{Y_n}(A^n_\varepsilon) \to 1$ (this is a convergence of $f_n$ to 1 in a weak sense).

If $X_n /\!/ Y_n \to 1$, then $\mathbb{P}_{X_n}(A^n_\varepsilon) \to 1$, and for any $B \subset A^n_\varepsilon$, $|\mathbb{P}_{Y_n}(B) - \mathbb{P}_{X_n}(B)| \le \varepsilon\,\mathbb{P}_{Y_n}(B)$; therefore, $\sup_{B\ \mathrm{Borelian}}|\mathbb{P}_{X_n}(B) - \mathbb{P}_{Y_n}(B)|$, the total variation distance between $X_n$ and $Y_n$, goes to 0. Hence, the following lemma is a straightforward consequence of the Portmanteau theorem:

Lemma 10. If $X_n /\!/ Y_n \to 1$ and $Y_n \xrightarrow[n]{(d)} Y$, then $X_n \xrightarrow[n]{(d)} Y$.
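On a finite state space the objects of Definition 1 reduce to ratios of probability mass functions, which makes the argument preceding Lemma 10 easy to experiment with. The sketch below is only an illustration under that finiteness assumption; the dictionaries `px`, `py` and all function names are hypothetical.

```python
def density_ratio(px, py):
    """f = dP_X/dP_Y for finite laws given as dicts (requires P_X << P_Y)."""
    assert all(py.get(x, 0) > 0 for x, p in px.items() if p > 0)
    return {x: px.get(x, 0) / py[x] for x in py if py[x] > 0}

def good_set(px, py, eps):
    """A_eps = {x : |f(x) - 1| <= eps}."""
    return {x for x, r in density_ratio(px, py).items() if abs(r - 1) <= eps}

def total_variation(px, py):
    """sup_B |P_X(B) - P_Y(B)|, equal to half the L1 distance for finite laws."""
    support = set(px) | set(py)
    return 0.5 * sum(abs(px.get(x, 0) - py.get(x, 0)) for x in support)

def check_bound(px, py, B, eps):
    """Check |P_Y(B) - P_X(B)| <= eps * P_Y(B) for a subset B of A_eps."""
    assert B <= good_set(px, py, eps)
    p_x = sum(px.get(x, 0) for x in B)
    p_y = sum(py[x] for x in B)
    return abs(p_y - p_x) <= eps * p_y

# Assumed toy laws: P_Y uniform on four points, P_X a small perturbation of it.
py = {x: 0.25 for x in range(4)}
px = {0: 0.26, 1: 0.24, 2: 0.25, 3: 0.25}
```

Here `good_set(px, py, 0.05)` is the whole space, so the total variation distance is controlled by the inequality checked in `check_bound`.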



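We end this section by an argument of continuity:

Lemma 11. Let $(g_n)$ be a sequence of continuous functions from $S$ into a Polish space $S'$. If $X_n /\!/ Y_n \to 1$, then $g_n(X_n) /\!/ g_n(Y_n) \to 1$.

Proof. If $\mathbb{P}_{X_n} \prec \mathbb{P}_{Y_n}$, then so do $\mathbb{P}_{g_n(X_n)} \prec \mathbb{P}_{g_n(Y_n)}$, and then there exists a nonnegative measurable function $h_n$ such that $\mathbb{P}_{g_n(X_n)} = h_n\mathbb{P}_{g_n(Y_n)}$. As above, denote $A^n_{\varepsilon'} := \{x,\ |f_n(x) - 1| \le \varepsilon'\}$, where $f_n$ satisfies $\mathbb{P}_{X_n} = f_n\mathbb{P}_{Y_n}$ and $\mathbb{P}_{Y_n}(A^n_{\varepsilon'}) \to 1$, and set $B^n_{\varepsilon'} = g_n(A^n_{\varepsilon'})$. We have $\mathbb{P}_{g_n(Y_n)}(B^n_{\varepsilon'}) \to 1$. For any $A \subset B^n_{\varepsilon'}$,

$\Big|\int_A (h_n - 1)\,d\mathbb{P}_{g_n(Y_n)}\Big| \le \varepsilon'$;

the inequality follows from $g_n^{-1}(A) \subset A^n_{\varepsilon'}$. Letting now $\varepsilon > 0$ be fixed and setting $A = \{x,\ h_n(x) - 1 > \varepsilon\} \cap B^n_{\varepsilon'}$ (or $A = \{x,\ h_n(x) - 1 < -\varepsilon\} \cap B^n_{\varepsilon'}$), we get $\mathbb{P}_{g_n(Y_n)}(\{x,\ |h_n(x) - 1| > \varepsilon\}) \le 2\varepsilon'/\varepsilon$; choosing $\varepsilon' > 0$ small, one sees that this is arbitrarily small for $n$ large enough. □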



2.2.3. Proof of the convergence of the uni-dimensional distributions in Theorem 2. In this section we work under $\mathbb{P}_n$. Let $X^n_m := (A_{u(m)}, |u(m)|)$ and $Y^n_m := (A^\star_m, |u(m)|)$, where the distribution of $A^\star_m$ knowing $|u(m)| = h$ is simply $Q_h$. The aim of this section is to compare $X^n_m$ with $Y^n_m$ and to deduce from the asymptotic behavior of $Y^n_m$ the one of $X^n_m$. The proof of the convergence of the finite-dimensional distributions will also use this strategy.
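For $M > 0$ and $n \in \mathbb{N}$, consider

$\Lambda_{n,M} = \{(a,h),\ h \in \sqrt{n}[M^{-1},M],\ a \in J_h\}$.

We have the following:

Proposition 12. (i) For any $m,n$ with $m \le n$, we have $\mathbb{P}_{X^n_m} \prec \mathbb{P}_{Y^n_m}$.

(ii) For any $s \in (0,1)$, $\alpha > 0$, there exists $M_0$ s.t. for $n$ large enough, $\mathbb{P}_n(Y^n_{\lfloor ns\rfloor} \in \Lambda_{n,M_0}) \ge 1 - \alpha$ and for any $M > 0$,

(16) $\sup_{(a,h) \in \Lambda_{n,M}}\left|\dfrac{\mathbb{P}_n(X^n_{\lfloor ns\rfloor} = (a,h))}{\mathbb{P}_n(Y^n_{\lfloor ns\rfloor} = (a,h))} - 1\right| \xrightarrow[n]{} 0$.

(iii) For any $s \in (0,1)$, $X^n_{\lfloor ns\rfloor} /\!/ Y^n_{\lfloor ns\rfloor} \to 1$.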

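Proof. (iii) is a consequence of (ii). Let $a \in \mathbb{N}^{I_K}[h]$. Since $\{A_{u(m)} = a\} \subset \{|u(m)| = h\}$, $\mathbb{P}_n((A_{u(m)},|u(m)|) = (a,h)) = \mathbb{P}_n(A_{u(m)} = a)$. According to Proposition 5 and formula (10),

(17) $\dfrac{\mathbb{P}_n(X^n_m = (a,h))}{\mathbb{P}_n(Y^n_m = (a,h))} = \dfrac{\mathbb{P}(|\mathbf{f}_{N_1(a)}| = m-h,\ |\mathbf{f}'_{1+N_2(a)}| = n+1-m)}{\mathbb{P}(|\mathbf{f}_{N_1(M^{(h)})}| = m-h,\ |\mathbf{f}'_{1+N_2(M^{(h)})}| = n+1-m)}$.

Then (i) holds true. Assume now that $s \in (0,1)$ and $\alpha > 0$ are fixed. There exists $M$ such that, for $n$ large enough, $\mathbb{P}_n(|u(\lfloor ns\rfloor)| \in \sqrt{n}[M^{-1},M]) \ge 1 - \alpha/2$ [since $h_n \xrightarrow[n]{(d)} h = 2\mathsf{e}/\sigma_\mu$ and since $\mathbb{P}(\mathsf{e}_s = 0) = 0$ for any $s \in (0,1)$]. For such a $M$,

$\mathbb{P}_n(Y^n_{\lfloor ns\rfloor} \in \Lambda_{n,M}) = \mathbb{P}_n(Y^n_{\lfloor ns\rfloor} \in \Lambda_{n,M},\ |u(\lfloor ns\rfloor)| \in \sqrt{n}[M^{-1},M])$
$= \sum_{l \in \sqrt{n}[M^{-1},M]}\mathbb{P}_n(|u(\lfloor ns\rfloor)| = l)\,\mathbb{P}_n(Y^n_{\lfloor ns\rfloor} \in \Lambda_{n,M} \mid |u(\lfloor ns\rfloor)| = l)$
$\ge (1 - \alpha/2)\min_{l \in \sqrt{n}[M^{-1},M]}\mathbb{P}(M^{(l)} \in J_l)$.

This minimum goes to 1, thanks to Lemma 9.

Now, according to Lemma 6(i) and (ii), since $\mathbf{f}$ and $\mathbf{f}'$ are independent,

$\mathbb{P}(|\mathbf{f}_{N_1(a)}| = \lfloor ns\rfloor - h,\ |\mathbf{f}'_{1+N_2(a)}| = n+1-\lfloor ns\rfloor) = \dfrac{N_1(a)(1+N_2(a))}{(\lfloor ns\rfloor - h)(n+1-\lfloor ns\rfloor)} \times \mathbb{P}(W_{\lfloor ns\rfloor - h} = -N_1(a))\,\mathbb{P}(W_{n-\lfloor ns\rfloor+1} = -N_2(a) - 1)$,

and then for any $M > 0$,

$\sup_{(a,h) \in \Lambda_{n,M}}\left|\dfrac{\mathbb{P}(|\mathbf{f}_{N_1(a)}| = \lfloor ns\rfloor - h,\ |\mathbf{f}'_{1+N_2(a)}| = n+1-\lfloor ns\rfloor)}{q_{n,s,h}} - 1\right| \xrightarrow[n]{} 0$

for

$q_{n,s,h} = \dfrac{\sigma_\mu^2h^2\exp(-\sigma_\mu^2h^2/(8ns(1-s)))}{8\pi n^3(s(1-s))^{3/2}}$.

Now, $\mathbb{P}(|\mathbf{f}_{N_1(M^{(h)})}| = \lfloor ns\rfloor - h,\ |\mathbf{f}'_{1+N_2(M^{(h)})}| = 1+n-\lfloor ns\rfloor) = A_h + B_h$, where

$A_h := \mathbb{P}(|\mathbf{f}_{N_1(M^{(h)})}| = \lfloor ns\rfloor - h,\ |\mathbf{f}'_{1+N_2(M^{(h)})}| = 1+n-\lfloor ns\rfloor,\ M^{(h)} \notin J_h)$,
$B_h := \mathbb{P}(|\mathbf{f}_{N_1(M^{(h)})}| = \lfloor ns\rfloor - h,\ |\mathbf{f}'_{1+N_2(M^{(h)})}| = 1+n-\lfloor ns\rfloor,\ M^{(h)} \in J_h)$.

Using again Lemma 6(i) and (ii), we get

$\sup_{h \in \sqrt{n}[M^{-1},M]}\left|\dfrac{B_h}{q_{n,s,h}} - 1\right| \xrightarrow[n]{} 0$.

On the other hand, $A_h \le \mathbb{P}(M^{(h)} \notin J_h) \le c_1\exp(-c_2h^{1/3}) \le 2\exp(-cn^{1/6}/M)$ for any $h \in \sqrt{n}[M^{-1},M]$. To complete the proof of (ii), check that $\sup_{h \in \sqrt{n}[M^{-1},M]}|A_h/B_h| \to 0$. □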

We have now all the tools to conclude the following:

Corollary 13. For any $s \in (0,1)$, letting $s_n = \lfloor ns\rfloor/n$, we have

$(G^{(n)}(s_n), h_n(s_n)) /\!/ (g(n,\sqrt{n}\,h_n(s_n)), h_n(s_n)) \to 1$,

and the convergence of the uni-dimensional distributions holds in Theorem 2. (Recall that $g$ is defined in Section 2.1.)

Proof of Corollary 13. Proposition 12 and Lemma 11 yield the first assertion of the Corollary.

For the second assertion, we first examine $s = 0$ and $s = 1$. Since $h_n(0) = 0$ and Remark 1 entails that $h_n(1) \xrightarrow[n]{\text{proba}} 0$, the convergence of the uni-dimensional distributions holds in Theorem 2 for $s = 0$ and $s = 1$.

For $s \in (0,1)$, since $h_n \xrightarrow[n]{(d)} h$ in $C[0,1]$, by the Skorohod representation theorem [14], Theorem 3.30, there exists a probability space $\Omega$ on which this convergence is a.s. On this space [or on an augmented space on which the pair $(g(n,\sqrt{n}\,h_n(s_n)), h_n(s_n))$ is defined],

(18) $(g(n,\sqrt{n}\,h_n(s_n)), h_n(s_n)) \xrightarrow[n]{(d)} (g^{(s)}, h_s)$,

where $g^{(s)}$ is a Gaussian vector whose covariance function [see (8)] allows one to check that $(g^{(s)}, h_s) \stackrel{(d)}{=} (G(s), h_s)$ for any $s \in (0,1)$. To prove that the convergence of the uni-dimensional distributions holds in Theorem 2, it remains to control the distance between $(G^{(n)}(s_n), h_n(s_n))$ and $(G^{(n)}(s), h_n(s))$. Since $h_n \to h$ a.s. in $C[0,1]$, $|h_n(s_n) - h_n(s)| \xrightarrow[n]{\text{a.s.}} 0$. For $G^{(n)}$, this is more complex, and we will establish some bounds useful also for the proof of the tightness. Let

$\Omega^\rho_n = \big\{T \in \mathcal{T}_n,\ \max_l\big||u(l+1)| - |u(l)|\big| \le \rho\log n\big\}$.

Let $\varepsilon > 0$. According to Lemma 7, for $\rho$ large enough, $\mathbb{P}_n(\Omega^\rho_n) \ge 1 - \varepsilon$ for $n$ large enough. We have, for $s'_n = \lfloor ns+1\rfloor/n$,

(19) $\|G^{(n)}(s_n) - G^{(n)}(s)\|_1 1_{\Omega^\rho_n} = n(s - s_n)\sum_{(k,j) \in I_K}|G^{(n)}_{k,j}(s'_n) - G^{(n)}_{k,j}(s_n)|\,1_{\Omega^\rho_n}$.

In $\Omega^\rho_n$, for any $k,j$, the differences $|G^{(n)}_{k,j}(s'_n) - G^{(n)}_{k,j}(s_n)|$ are bounded by $2\rho n^{-1/4}\log n$ (which is a bound on the number of noncommon ancestors of two consecutive nodes in the LO for a tree in $\Omega^\rho_n$, rescaled by $n^{-1/4}$). Hence, since $s - s_n \le 1/n$, for any $\varepsilon' > 0$, for $n$ large enough,

(20) $\|G^{(n)}(s_n) - G^{(n)}(s)\|_1 1_{\Omega^\rho_n} \le c(s - s_n)^{1/4-\varepsilon'}$

for some constant $c$. One concludes that $\|G^{(n)}(s_n) - G^{(n)}(s)\|_1 \xrightarrow[n]{\text{proba}} 0$. □

2.3. Convergence of the finite-dimensional distributions. In this section $\kappa \ge 2$ is a fixed integer. We denote by $s^{(\kappa)}$ the vector $(s_1,\dots,s_\kappa)$ where $0 < s_1 < \dots < s_\kappa < 1$ are fixed. Let $T \in \mathcal{T}_n$. For $i \in [\![1,\kappa]\!]$, set $u_i = u(\lfloor ns_i\rfloor)$ ($u_0 = u_{\kappa+1} = \varnothing$) and

$L(T) = \{u_i,\ i \in [\![1,\kappa]\!]\}$.

We assume that $n$ is large enough such that $\lfloor ns_1\rfloor \ge 1$, and $\lfloor ns_i\rfloor \ne \lfloor ns_j\rfloor$ for $i \ne j$, so that the $u_i$'s are different nodes of $T$ sorted according to the LO.

The aim of this section is to study the distribution of $(A_{u_i})_{i \in [\![1,\kappa]\!]}$ under $\mathbb{P}_n$, and to deduce from this the convergence of the finite-dimensional distributions in Theorem 2. The ideas are of the same type as in the case of the uni-dimensional distributions, but the details are more involved since the dependences between the r.v. $A_{u_i}$'s must be taken into account. For this, the shape of the tree spanned by the $u_i$'s must be considered.

Denote by $u_{i,j}$ the deepest (i.e., youngest) common ancestor of $u_i$ and $u_j$. Let $T^{(\kappa)} = \bigcup_{i=1}^{\kappa}[\![\varnothing,u_i]\!]$ be the subtree "spanned" by the $u_i$'s,

$Z(T) = \{u_{i,j},\ 1 \le i < j \le \kappa\} = \{u_{i,i+1},\ i \in [\![1,\kappa-1]\!]\}$,
$Z^*(T) = Z(T) \cap L(T)$,

the set of branching nodes in $T^{(\kappa)}$, and the nodes in $L(T)$ that are ancestors of other nodes of $L(T)$.
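With nodes encoded as words, the deepest common ancestor is the longest common prefix, so the sets $Z(T)$ and $Z^*(T)$ are straightforward to compute. The sketch below assumes nodes are Python tuples listed in lexicographic order; the function names and the example nodes are hypothetical.

```python
from itertools import combinations

def common_ancestor(u, v):
    """Deepest common ancestor of two nodes (tuples): their longest common prefix."""
    i = 0
    while i < min(len(u), len(v)) and u[i] == v[i]:
        i += 1
    return u[:i]

def branching_sets(marked):
    """Given the marked nodes u_1 < ... < u_kappa (lexicographic order),
    return (Z, Z_star) with Z = {u_{i,i+1}} and Z_star = Z intersected with L."""
    L = list(marked)
    Z = {common_ancestor(L[i], L[i + 1]) for i in range(len(L) - 1)}
    return Z, Z & set(L)

def all_pairs_ancestors(marked):
    """The set {u_{i,j}, i < j}, computed over all pairs for comparison."""
    return {common_ancestor(u, v) for u, v in combinations(marked, 2)}

# Assumed example: three marked nodes, the first being an ancestor of the second.
marked = [(1,), (1, 2), (2, 1)]
Z, Z_star = branching_sets(marked)
```

In this example the node `(1,)` belongs to both $L(T)$ and $Z(T)$, which is exactly the situation the coding of $Z^*(T)$ is meant to record.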

Definition 2. The shape function $\flat(n)$ associates with $T^{(\kappa)}$ the smallest tree in $\mathcal{T}$ having the same branching structure (that we call shape) together with a coding of the nodes of $Z^*(T)$ (see Figure 6). Formally,

$\flat(n): \mathcal{T}_n \times [0,1]^\kappa \to \mathcal{T} \times \mathcal{P}_F(\mathcal{U})$,
$(T, s^{(\kappa)}) \mapsto (T^{\flat(n)}, I_T^{\flat(n)})$,

where $\mathcal{P}_F(\mathcal{U})$ is the set of finite subsets of $\mathcal{U}$ and where $T^{\flat(n)}$ is characterized by:

(i) $T^{\flat(n)}$ is a tree having $\#(Z(T) \cup L(T) \cup \{\varnothing\})$ nodes,

(ii) there exists an increasing function $\Phi_T$ from $Z(T) \cup L(T) \cup \{\varnothing\}$ in $T^{\flat(n)}$, preserving the descendants: $\Phi_T(u)$ is an ancestor of $\Phi_T(v)$ in $T^{\flat(n)}$ iff $u$ is an ancestor of $v$ in $T$.

The set $I_T^{\flat(n)}$ is defined to be $\Phi_T(Z^*(T))$.

The tree $T^{\flat(n)}$ can be constructed by somehow squeezing the paths between the nodes of $Z(T) \cup L(T) \cup \{\varnothing\}$ into unit length edges and by renaming the vertices in order to get a tree. The function $\Phi_T$ is unique and for short, for any $u \in Z(T) \cup L(T) \cup \{\varnothing\}$, we write $u^{\flat(n)}$ instead of $\Phi_T(u)$. The set $I_T^{\flat(n)}$ encodes the images of the nodes of $Z^*(T)$. Notice that $\#I_T^{\flat(n)} = \kappa - \#\partial T^{\flat(n)}$, and when $I_T^{\flat(n)}$ is not empty, the tree $T^{\flat(n)}$ alone is not sufficient in general to guess $I_T^{\flat(n)}$.

In what follows, we will often write $\flat$ instead of $\flat(n)$.

A pair $(u,v)$ [with $u,v \in L(T) \cup Z(T) \cup \{\varnothing\}$] such that $u^\flat = \mathrm{fa}(v^\flat)$ is the father of $v^\flat$ in $T^\flat$ will be called a spanned branch. The contents of the spanned branches will be carefully handled since they contribute in general to several $A_{u_i}$'s. The set of spanned branches can naturally be indexed by the edges $(u^\flat,v^\flat)$ of $T^\flat$ but also by the nodes of $T^\flat \setminus \{\varnothing\}$, using the bijection between the edges of $T^\flat$ and $T^\flat \setminus \{\varnothing\}$ that associates with the edge $(u^\flat,v^\flat)$ the node $v^\flat$.

Using this labeling, we define $A_{(v^\flat)}$, the content of the spanned edge $(u,v)$, by

$A_{(v^\flat),k,j} := \#\{w \in ]\!]u,v[\![,\ c_w = k,\ f_w(v) = j\}$.

The extremities of the spanned branches are not counted in the $A_{(v^\flat),k,j}$'s in order to simplify the decompositions ($\varnothing$ was counted in the unidimensional case). It is easy to check that

(21) $A_{(v^\flat),k,j} = (A_{v,k,j} - A_{u,k,j}) - 1_{(c_u,f_u(v)) = (k,j)}$.

We also introduce the "ordered content" of the edges. For any $v^\flat \in T^\flat \setminus \{\varnothing\}$, define $\mathbb{A}_{(v^\flat)}(T)$, the ordered content of the edge $(u,v)$, by

$\mathbb{A}_{(v^\flat)}(T) := ((c_w, f_w(v)),\ w \in ]\!]u,v[\![)$,

the nodes of $]\!]u,v[\![$ being sorted according to the LO.

We write simply $\mathbb{A}(T) = (\mathbb{A}_{(v^\flat)}(T))_{v^\flat \in T^\flat \setminus \{\varnothing\}}$, the list of ordered contents of the spanned edges.

The ordered content of any edge belongs to $\bigcup_{l \ge 0}(I_K)^l$. The canonical surjection $\pi$ defined on $\bigcup_{l \ge 0}(I_K)^l$ associates with the ordered content $\mathbb{B} = ((k_i,j_i),\ i = 1,\dots,l)$ the content $B$:

$B_{k,j} = (\pi(\mathbb{B}))_{k,j} = \#\{i,\ (k_i,j_i) = (k,j)\}$.

The application $\pi$ can be extended to the list of ordered contents, and we set

$A(T) = \pi((\mathbb{A}_{(v^\flat)}(T))_{v^\flat \in T^\flat \setminus \{\varnothing\}}) = (\pi(\mathbb{A}_{(v^\flat)}(T)))_{v^\flat \in T^\flat \setminus \{\varnothing\}}$.

The definitions of $N_1$ and $N_2$ (given in Section 2.2) are extended to $\bigcup_{l \ge 0}(I_K)^l$: we set

$N_1(\mathbb{B}) := N_1(\pi(\mathbb{B}))$ and $N_2(\mathbb{B}) := N_2(\pi(\mathbb{B}))$.

We denote by $H_{v^\flat}$ the cardinality $\#]\!]u,v[\![$ where $u^\flat = \mathrm{fa}(v^\flat)$. We set

$\mathcal{H}_T = (H_{v^\flat})_{v^\flat \in T^\flat \setminus \{\varnothing\}}$,

the ordered list of the spanned branches lengths.

The nodes of $Z(T) \cup \{\varnothing\}$ are the "hinge nodes" lying between the spanned edges, and they also contribute to the $A_{u_i}$'s. For any $u$ in $Z(T) \cup \{\varnothing\}$, the set $f_u(L(T))$ is a subset of $[\![1,c_u(T)]\!]$ with $c_{u^\flat}(T^\flat)$ elements: the set of the ranks of the children of $u$ that are ancestors of the nodes of $L(T)$. We encode the contribution of $Z(T) \cup \{\varnothing\}$ to the lineage, thanks to the sequence $\Theta_T$:

$\Theta_T = (C(u^\flat), R(u^\flat 1), \dots, R(u^\flat c_{u^\flat}(T^\flat)))_{u^\flat \in T^\flat \setminus \partial T^\flat}$,

where $C(u^\flat) = c_u(T)$ and $R(u^\flat 1) < \dots < R(u^\flat c_{u^\flat}(T^\flat))$ is the sorted list of the elements of $f_u(L(T))$. Note that $u^\flat 1,\dots,u^\flat c_{u^\flat}(T^\flat)$ are the children of $u^\flat$ in $T^\flat$ and then the arguments of $R$ are unambiguous.

Fig. 7. The forests considered in the decomposition are surrounded. Notice that the node $u_i$ belongs to a forest only if it is a leaf in $T^{(\kappa)}$. Observe also the contribution of the neighbors of the "hinge nodes," the nodes $z_i$'s on the picture.

The idea now is the following. If $(T^{\flat(n)}, I_T^{\flat(n)}, \mathbb{A}(T), \Theta_T)$ is known, to end the description of $T$ using $T^{(\kappa)}$, it remains to describe the fringe subtrees rooted in the neighborhood of $T^{(\kappa)}$ (the fringe subtree of $T$ rooted at $u$ is $T_u = \{v \in \mathcal{U} : uv \in T\}$). We pack these subtrees into forests that are, up to some border effects, rooted on the neighbors of $T^{(\kappa)}$ between $u_l$ and $u_{l+1}$.

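For any simple path $I$ in $T$, we denote by $\mathcal{N}(I)$ the neighborhood of $I$:

$\mathcal{N}(I) := \{u \in T,\ d_T(u,I) = 1\}$.

We now build the set of roots of the forest we consider (see Figure 7):

$S_0 = \{v \in \mathcal{N}([\![\varnothing,u_1[\![),\ \varnothing \prec v \prec u_1\}$,

$S_l = \{v \in \mathcal{N}([\![u_l,u_{l+1}]\!]),\ u_l \prec v \prec u_{l+1}\} \cup \{u_l\}$ if $u_l \in L(T) \setminus Z^*(T)$,
$S_l = \{v \in \mathcal{N}([\![u_l,u_{l+1}]\!]),\ u_l \prec v \prec u_{l+1}\}$ if $u_l \in Z^*(T)$,

$S_\kappa = \{v \in \mathcal{N}([\![u_\kappa,\varnothing]\!]),\ u_\kappa \prec v\} \cup \{u_\kappa\}$.

The forests we consider are

$\mathfrak{F}_l(T) := (T_u,\ u \in S_l)$;

we denote by $\mathfrak{F}(T) = (\mathfrak{F}_l(T))_{l=0,\dots,\kappa}$ the (ordered) sequence of forests.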

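Let $l \in [\![0,\kappa-1]\!]$ be fixed. Let $u_l^\flat = v_0^\flat, v_1^\flat, \dots, v_m^\flat = u_{l+1}^\flat$ be the shortest path in $T^\flat$ between $u_l^\flat$ and $u_{l+1}^\flat$. Let $v_i^\flat = u_l^\flat \wedge u_{l+1}^\flat$ be the deepest (youngest) common ancestor of $u_l^\flat$ and $u_{l+1}^\flat$. Since $v_0^\flat \prec v_m^\flat$, two cases arise, $v_i^\flat = v_0^\flat$ or $0 < i < m$. We have

(22) $\#S_l(T) = \mathcal{M}_{l,l+1}(T^\flat, I_T^\flat, \mathbb{A}(T)) + \mathcal{Y}_{l,l+1}(T^\flat, I_T^\flat, \Theta_T)$,

where, for any $l$, $\mathcal{M}_{l,l+1}(T^\flat, I_T^\flat, \mathbb{A}(T))$ counts the number of subtrees rooted on the neighbors of the spanned branches visited between $u_l$ and $u_{l+1}$, according that these subtrees are on the right or on the left of these spanned branches:

$\mathcal{M}_{l,l+1}(T^\flat, I_T^\flat, \mathbb{A}(T)) = \sum_{p=0}^{i-1}N_2(\mathbb{A}_{(v_p^\flat)}) + \sum_{p=i+1}^{m}N_1(\mathbb{A}_{(v_p^\flat)})$,

and $\mathcal{Y}_{l,l+1}(T^\flat, I_T^\flat, \Theta_T)$ counts the number of subtrees rooted on the neighbors of the nodes of $Z(T) \cup \{\varnothing\}$:

$\mathcal{Y}_{l,l+1}(T^\flat, I_T^\flat, \Theta_T) = [R(v_{i+1}^\flat) - R(v_{i-1}^\flat)1_{i>0} - 1] + \sum_{p=1}^{i-1}[C(v_p^\flat) - R(v_p^\flat c_{v_p^\flat}(T^\flat))] + \sum_{p=i+1}^{m-1}[R(v_p^\flat 1) - 1]$.

Indeed, $R(v_{i+1}^\flat) - R(v_{i-1}^\flat)1_{i>0} - 1$ is the number of children of $v_i$ in $S_l$, the sum $\sum_{p=1}^{i-1}[C(v_p^\flat) - R(v_p^\flat c_{v_p^\flat}(T^\flat))]$ counts the number of children of $v_1,\dots,v_{i-1}$ in $S_l$, and $\sum_{p=i+1}^{m-1}[R(v_p^\flat 1) - 1]$ the children of $v_{i+1},\dots,v_{m-1}$ in $S_l$.

For $l = \kappa$, let $u_\kappa^\flat = v_0^\flat, v_1^\flat, \dots, v_m^\flat = u_{\kappa+1}^\flat = \varnothing$. Since $v_m^\flat$ is an ancestor of $v_0^\flat$, we have

$\mathcal{M}_{l,l+1}(T^\flat, I_T^\flat, \mathbb{A}(T)) = \sum_{p=0}^{m-1}N_2(\mathbb{A}_{(v_p^\flat)})$,

$\mathcal{Y}_{l,l+1}(T^\flat, I_T^\flat, \Theta_T) = \sum_{p=1}^{m}[C(v_p^\flat) - R(v_p^\flat c_{v_p^\flat}(T^\flat))]$;

notice that the term $p = m$ concerns the root.

The cardinalities $F_l(T) = \#\mathfrak{F}_l(T)$ satisfy

(23) $F_l(\mathcal{H}_T, T^\flat) = (\lfloor ns_{l+1}\rfloor - \lfloor ns_l\rfloor + 1) - (|u_{l+1}| - |u_{l,l+1}|) - 1_{u_l \in Z^*(T)}$,

since the visit times of the nodes $u_i$ are $\lfloor ns_i\rfloor$ and since $|u_{l+1}| - |u_{l,l+1}| + 1_{u_l \in Z^*(T)}$ nodes visited during $[ns_l, ns_{l+1}]$ are not in $\bigcup_{u \in S_l}T_u$.

For any tree $t$ having $n$ nodes, $\mathbb{P}(T = t) = \prod_{u \in t}\mu_{c_u(t)}$. Hence, if $t$ has $n$ nodes, putting together the contribution of the forests and of the ordered edge contents, we get

(24) $\mathbb{P}(T = t) = \Big(\prod_{l=0}^{\kappa}\mathbb{P}(\mathbf{f}_{\#S_l(t)} = \mathfrak{F}_l(t))\Big)\Big(\prod_{v^\flat \in t^\flat \setminus \{\varnothing\}}\prod_{(k,j) \in I_K}\mu_k^{A_{(v^\flat),k,j}(t)}\Big)\Big(\prod_{z^\flat \in t^\flat \setminus \partial t^\flat}\mu_{C(z^\flat)}\Big)$,

where $\mathbf{f}_x$ is a random forest with $x$ roots (see Section 2.2).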
Comment 1. A few points worth mentioning:

- Let $A = (A_{k,j})_{(k,j) \in I_K}$ with $\sum_{(k,j) \in I_K}A_{k,j} = h$. The number of ordered contents having $A$ as content is

(25) $\#\pi^{-1}(A) = \dfrac{h!}{\prod_{(k,j) \in I_K}A_{k,j}!}$

and thus,

(26) $\#\pi^{-1}(A)\prod_{(k,j) \in I_K}\mu_k^{A_{k,j}} = Q_h(A)$.

This is simply due to the fact that all permutations of the "symbols" $(k,j)$'s are possible in the ordered contents $\mathbb{B}$'s such that $\pi(\mathbb{B}) = A$.

- The size of the forests $\mathfrak{F}_l$, as well as their number of trees, is a function of the contents (it does not depend on the order of the content).
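The counting in Comment 1 is the usual multinomial coefficient, and it can be compared with a brute-force enumeration on a small content. The sketch below assumes that each symbol $(k,j)$ carries probability $\mu_k$; the content, the offspring law and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def num_ordered_contents(content):
    """h! / prod(A_{k,j}!) for a content given as a dict (k, j) -> multiplicity."""
    h = sum(content.values())
    return factorial(h) // prod(factorial(m) for m in content.values())

def brute_force(content):
    """Count the distinct orderings of the symbols by explicit enumeration."""
    symbols = [s for s, m in content.items() for _ in range(m)]
    return len(set(permutations(symbols)))

def multinomial_prob(content, mu):
    """Probability of the content when each symbol (k, j) has probability mu[k]."""
    weight = prod(mu[k] ** m for (k, j), m in content.items())
    return num_ordered_contents(content) * weight

# Assumed example: a content of total size h = 4 and a critical law on {0, 1, 2, 3}.
content = {(2, 1): 2, (3, 2): 1, (1, 1): 1}
mu = {0: Fraction(3, 8), 1: Fraction(3, 8), 2: Fraction(1, 8), 3: Fraction(1, 8)}
```

The brute-force count is only feasible for small $h$; it is included to make the permutation argument concrete.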

We are now able to express the probability to observe a shape together 
with the (ordered) contents in terms of the probability that some forests 
have some prescribed sizes. 

The probability to observe some contents will be obtained by summing 
on all corresponding ordered contents, thanks to (26). We won't do this 
job on all possible shapes since asymptotically only "the simplest shapes" 
eventually happen. We first state a result in this direction. 

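2.3.1. Decomposition of a tree T given $A_T$ and

Let $\mathcal{T}^{\star}_{2\kappa-1} = \{T \in \mathcal{T}_{2\kappa-1},\ \deg(\varnothing) = 1,\ \forall u \in T \setminus \varnothing,\ \deg(u) \in \{0,2\}\}$ be the set of trees with $2\kappa-1$ edges, with binary branching points (except the root that has only one child). Denote by

$\mathcal{A}_{n,M} = \{T \in \mathcal{T}_n,\ T^\flat \in \mathcal{T}^{\star}_{2\kappa-1},\ \forall (u,v) \text{ s.t. } u^\flat = \mathrm{fa}(v^\flat),\ d(u,v) \in \sqrt{n}[M^{-1},M]\}$.

A tree in $\mathcal{A}_{n,M}$ has its shape in $\mathcal{T}^{\star}_{2\kappa-1}$, and all its spanned branches lengths in $\sqrt{n}[M^{-1},M]$. By a counting argument, $I_T^\flat$ is empty, which means that the nodes $u_i$ are sent on the leaves of $T^\flat$ by $\Phi_T$.

Lemma 14. For any $\varepsilon > 0$, there exists $M > 0$ such that, for $n$ large enough,

$\mathbb{P}_n(\mathcal{A}_{n,M}) \ge 1 - \varepsilon$.

Proof. This is a consequence of $h_n \xrightarrow[n]{(d)} h = 2\mathsf{e}/\sigma_\mu$ and of the properties of $\mathsf{e}$: $\mathsf{e}$ is a.s. nonnull on $(0,1)$ and the local minima of $\mathsf{e}$ are a.s. all different (the continuum random tree is a.s. a binary tree). □

For any $T \in \mathcal{A}_{n,M}$, $\Phi_T$ sends the nodes of $L(T)$ on the leaves of $T^\flat$ and the nodes of $Z(T)$ on the internal nodes of $T^\flat \setminus \{\varnothing\}$. The branching points are distinct, $\#Z(T) = \kappa - 1$ and $L(T) \cap Z(T)$ is empty. Therefore, $\Theta_T$ belongs to $\mathcal{D}_\kappa = \mathcal{V}_2 \times \mathcal{V}_3^{\kappa-1}$, where $\mathcal{V}_2 = \{(c,x),\ 1 \le x \le c\}$ and $\mathcal{V}_3 = \{(c,x,y),\ 1 \le x < y \le c\}$. Note that, under (H1), $\mathcal{D}_\kappa$ is a subset of $[\![1,K]\!]^{3\kappa-1}$.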

2.3.2. A second comparison result. The idea is to compare $(A(T), \mathcal{H}_T, \Theta_T, T^\flat)$ under $\mathbb{P}_n$ with some simple random variables. Thanks to the previous lemma, we will somehow consider only the case where $I_T^\flat$ is empty.

Denote by $(A_n, \mathcal{H}_n, \Theta_n, T_n^\flat)$ the r.v. $(A(T), \mathcal{H}_T, \Theta_T, T^\flat)$ under $\mathbb{P}_n$. Hence, $A_n = (A_n^i)_{i=1,\dots,\mathsf{k}}$ is the sequence of contents, $\mathcal{H}_n = (\mathcal{H}_n^1,\dots,\mathcal{H}_n^{\mathsf{k}})$ the sequence of spanned edge lengths, $\Theta_n$ the sequence of branching properties of the hinge nodes, where $\mathsf{k}$ is then the number of spanned branches (the number of edges of $T_n^\flat$). These sequences were labeled by the nodes of the tree $T_n^\flat$ but we may and will consider that they are labeled by integers (the LO is a total order). This is equivalent when the shape is known, and allows one to work also when the shape is not known.

We define now the 4-tuple $(A_n^\star, \mathcal{H}_n, \Theta_n^\star, T_n^\flat)$ as follows: $\mathcal{H}_n$ and $T_n^\flat$ have the same law as above, and $A_n^\star$ and $\Theta_n^\star$ are described conditionally on $\mathcal{H}_n$ and $T_n^\flat$. Conditionally on $\mathcal{H}_n = (\mathcal{H}_n^1,\dots,\mathcal{H}_n^{\mathsf{k}})$, where $\mathsf{k}$ is then the number of spanned branches (the number of edges of $T_n^\flat$), we have $A_n^\star = (A_n^{i\star})_{i=1,\dots,\mathsf{k}}$ where the r.v. $A_n^{i\star}$'s are independent with respective distribution $Q_{\mathcal{H}_n^i}$. The random variable $\Theta_n^\star = (\Theta_n^\star(i))_{i=1,\dots,\kappa}$ has $\kappa$ coordinates that are independent of $(\mathcal{H}_n, A_n^\star, T_n^\flat)$ and distributed as follows:

$\mathbb{P}(\Theta_n^\star(1) = (l,j)) = \mu_l$ for any $(l,j) \in \mathcal{V}_2$,

$\mathbb{P}(\Theta_n^\star(i) = (l,j,j')) = 2\mu_l/\sigma_\mu^2$ for any $(l,j,j') \in \mathcal{V}_3$, $i \ge 2$.

Since the mean and the variance under $\mu$ are respectively 1 and $\sigma_\mu^2$, these formulas define indeed two distributions. In the following $\theta[l]$ stands for the vector $(\theta_1,\dots,\theta_l)$. Let

$\Gamma_{n,M} = \{(A(T), \mathcal{H}_T, \Theta_T, T^\flat),\ T \in \mathcal{A}_{n,M}\} \cap \mathrm{supp}(A_n^\star, \mathcal{H}_n, \Theta_n^\star, T_n^\flat)$.
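That the two laws given for the coordinates of $\Theta_n^\star$ are probability distributions follows from the mean being 1 and the variance being $\sigma_\mu^2$, and this can be verified numerically. The sketch below assumes the sets $\mathcal{V}_2 = \{(c,x), 1 \le x \le c\}$ and $\mathcal{V}_3 = \{(c,x,y), 1 \le x < y \le c\}$ of Section 2.3.1; the offspring law `mu` and the name `hinge_laws` are hypothetical.

```python
from fractions import Fraction

def hinge_laws(mu):
    """Return the law of the root hinge (on V_2) and of a binary hinge (on V_3)
    for an offspring law mu with mean 1, given as a dict k -> probability."""
    K = max(mu)
    sigma2 = sum(k * k * p for k, p in mu.items()) - 1   # variance, mean assumed 1
    root = {(l, j): mu.get(l, 0)
            for l in range(1, K + 1) for j in range(1, l + 1)}
    binary = {(l, j, jp): 2 * mu.get(l, 0) / sigma2
              for l in range(2, K + 1)
              for j in range(1, l + 1) for jp in range(j + 1, l + 1)}
    return root, binary

# Assumed critical law with variance 1 supported on {0, 1, 2, 3}.
mu = {0: Fraction(3, 8), 1: Fraction(3, 8), 2: Fraction(1, 8), 3: Fraction(1, 8)}
root, binary = hinge_laws(mu)
```

The first law sums to $\sum_l l\mu_l$ and the second to $\sum_l l(l-1)\mu_l/\sigma_\mu^2$, both equal to 1 for a critical law.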

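The following proposition generalizes Proposition 5 to finite-dimensional distributions.

Proposition 15. (i) For any $\varepsilon > 0$, there exists $M$ such that, for $n$ large enough,

$\mathbb{P}_n((A_n^\star, \mathcal{H}_n, \Theta_n^\star, T_n^\flat) \in \Gamma_{n,M}) \ge 1 - \varepsilon$.

(ii)

(27) $\sup_{(a,x,\theta,\tau^\flat) \in \Gamma_{n,M}}\left|\dfrac{\mathbb{P}_n((A_n, \mathcal{H}_n, \Theta_n, T_n^\flat) = (a,x,\theta,\tau^\flat))}{\mathbb{P}_n((A_n^\star, \mathcal{H}_n, \Theta_n^\star, T_n^\flat) = (a,x,\theta,\tau^\flat))} - 1\right| \xrightarrow[n]{} 0$.

In general, $\mathbb{P}_{(A_n,\mathcal{H}_n,\Theta_n,T_n^\flat)} \not\prec \mathbb{P}_{(A_n^\star,\mathcal{H}_n,\Theta_n^\star,T_n^\flat)}$, since $\mathrm{supp}(\Theta_n^\star)$ is strictly included in $\mathrm{supp}(\Theta_n)$ when $\mu[3,+\infty) > 0$ (the variable $\Theta_n^\star$ mimics the coding of binary branchings on $T_n^\flat$). In that case, moreover, $\mathbb{P}_n(T_n^\flat \notin \mathcal{T}^{\star}_{2\kappa-1}) > 0$ for $n$ large enough, and no control of $(A_n^\star, \mathcal{H}_n, \Theta_n^\star, T_n^\flat)$ is provided on the complement of $\mathcal{T}^{\star}_{2\kappa-1}$. But an analogue of Lemmas 10 and 11 can be written by weakening slightly the condition $\mathbb{P}_{X_n} \prec \mathbb{P}_{Y_n}$ of Definition 1:

Definition 3. Let $(Y_1,Y_2,\dots)$ and $(X_1,X_2,\dots)$ be two sequences of r.v. taking their values in a Polish space $S$. We say that $\mathbb{P}_{X_n}/\!\star\,\mathbb{P}_{Y_n} \to 1$ or $X_n /\!\star\!/ Y_n \to 1$ if for any $\varepsilon > 0$ there exists a measurable set $A^n_\varepsilon$ and a measurable function $f_n^\varepsilon : A^n_\varepsilon \to \mathbb{R}$ satisfying $\mathbb{P}_{X_n} = f_n^\varepsilon\mathbb{P}_{Y_n}$ on $A^n_\varepsilon$, such that $\sup_{x \in A^n_\varepsilon}|f_n^\varepsilon(x) - 1| \to 0$ and such that $\mathbb{P}_{Y_n}(A^n_\varepsilon) \ge 1 - \varepsilon$ for $n$ large enough.

Lemma 16. Assume that $X_n /\!\star\!/ Y_n \to 1$, then:

• If $Y_n \xrightarrow[n]{(d)} Y$, then $X_n \xrightarrow[n]{(d)} Y$.
• Let $(g_n)$ be a sequence of continuous functions from $S$ into a Polish space $S'$. If $X_n /\!\star\!/ Y_n \to 1$, then $g_n(X_n) /\!\star\!/ g_n(Y_n) \to 1$.

The proof is the same as those of Lemmas 10 and 11 and is left to the reader.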



Proof of Proposition 15. (i) Let $\varepsilon > 0$ be fixed. By Lemma 14, there exists $M$ such that $\mathbb{P}_n(\mathcal{A}_{n,M}) \ge 1 - \varepsilon/2$. Now conditionally on $T \in \mathcal{A}_{n,M}$, the $\mathcal{H}_n^i$'s belong to $[\sqrt{n}/M, \sqrt{n}M]$, and then the multinomial random variables $A_n^{i\star}$, by Lemma 9, satisfy together $A_n^{i\star} \in J_{\mathcal{H}_n^i}$ with probability arbitrarily close to 1 (for any $M$, when $n$ is large enough).

We examine now (ii). Let {a,x,e,T^) G r„,M for = ((ji , jj), 0'° , j2> jD^ 

• • • ) ( Jk ) ) Jft ) ) • 

Using (24) and Comment 1, 

P((A,W„,e„,T^) = (a,x,0,r'')) 

^^^^ ^ (n^:r'Qx.(aO)(nti/^i°mif|i(,.,^,3,,)i=-?^/(x,r^),o<^<^) 

P(|T|=n) 

where the f^'^'s are independent forests. 
On the other hand, 

P((^;,W„,e;,T^) = (a,x,0,r^)) 

= P((^^,e:;) = (a,0)|(W„,T^) = ixy))F{{nn,Ti) = (x,r^)) 

= ( n '^x^xa.)) [fif^j^^ (|:)'^~''^((^-^') = 

summing formula (28) on all possible values of the aj's and the 0's leads to 
P((?^„,T^) = (x,r'')) 



P(|T|=n) 

where m = (rrij),— i^...^^ is a vector of k multinomial independent r.v. (the 
parameters of rrij are Xj and p), and where 9 = (^(i))ie[i,K| =^'^^ &n ^^'^ 
independent of m. Hence, for (a,x, ^,t'') G Tn,M, 

P((A,?^„,a„,T^) = (a,x,g,r^)) 
^^^^ P((^^,W„,e^,n) = (a,x,0,T^)) 

_P(|f|i(..,,,3,.)l = ^Kx,r''),0<Z<K) 



It is easy to check that for any (a,x, r^,^) in F^ jv/) any /, for n large 
enough, 



\Fi{x,T')-nisi+i-si)\<n^/^, 



2 



since $(1/2)(2/3) < 5/12$. This allows to approximate, on one hand, $F_l(x,\tau^\flat)$ by $n(s_{l+1} - s_l)$ and, on the other hand, $\#S_l(\tau^\flat,a,\theta)$ by $\frac{\sigma_\mu^2}{2}d(u_l,u_{l+1})$ on $\Gamma_{n,M}$ since $n^{5/12} = o(n^{1/2})$, the order of $d(u_l,u_{l+1})$. By Otter and the central local limit theorem and thanks also to a decomposition of the denominator along $\{\mathbf{m} \in \prod_i J_{x_i}\}$ or in its complement (as in the proof of Proposition 12), we get



(31) sup 



|f|i{.^.,a,.)l=^Kx,r''),0</<K; ^ 



^0. 

□ 



2.3.3. Proof of the convergence of the finite-dimensional distributions in Theorem 2. We now show that Proposition 15 and Lemma 16 imply the convergence of the finite-dimensional distributions in Theorem 2. The proof is similar to that of Corollary 13.

Thanks to the Skorohod representation theorem ([14], Theorem 3.30), there exists a probability space $\Omega$ on which the convergence of $h_n$ to $h$ is a.s. On $\Omega$, the vector

$V_n = (h_n(s_1), h_n(s_1,s_2), h_n(s_2), h_n(s_2,s_3), \dots, h_n(s_\kappa))$,

which determines $T_n^\flat$, as well as the lengths of the spanned branches, converges a.s. to

$V_\infty = (h(s_1), h(s_1,s_2), h(s_2), h(s_2,s_3), \dots, h(s_\kappa))$,

which determines $T_\infty^{(\kappa)}$, the ordered discrete subtree of the continuum random tree $\mathcal{T}_\infty$, with contour process $h$, spanned by the root and the nodes visited at times $s_1,\dots,s_\kappa$. The edge lengths of $T_\infty^{(\kappa)}$ are given by the normalized Brownian excursion (see Aldous [1, 2]). The coordinates of $V_\infty$ are distinct and nonzero a.s., and then $T_\infty^{(\kappa)}$ has only binary branching points, and its shape $T_\infty^\flat$ belongs to $\mathcal{T}^{\star}_{2\kappa-1}$ (we call here shape the tree $T_\infty^{(\kappa)}$ where the edge lengths are somehow fixed to 1). Let $\mathcal{H}_\infty = (\mathcal{H}_{\infty,i})_{i \in [\![1,2\kappa-1]\!]}$ be the lengths of the (sorted) spanned branches in $T_\infty^{(\kappa)}$. By the properties of the Brownian excursion, the coordinates of $\mathcal{H}_\infty$ are almost surely all positive and finite, and then there exists $M$ such that all the $\mathcal{H}_{\infty,i}$ belong to $[M^{-1},M]$ (for a $M$ depending on $T_\infty^{(\kappa)}$). On $\Omega$,

(32) $T_n^\flat = T_\infty^\flat$ for $n$ large enough,

and therefore, for $n$ large enough, $T_n \in \mathcal{A}_{n,2M}$.

Denote by $(A_n^i)_{i \in [\![1,2\kappa-1]\!]}$ the (sorted) corresponding contents of the spanned branches of $T_n$, and by $(\mathcal{H}_n^i)_{i \in [\![1,2\kappa-1]\!]}$ their lengths. The normalized contents are then given by

$G_n^i := n^{-1/4}(A_n^i - \mathcal{H}_n^i \cdot \mu)$.

Proposition 15 and Lemma 16 entail that

$((G_n^i)_{i \in [\![1,2\kappa-1]\!]}, \mathcal{H}_n/\sqrt{n}) /\!\star\!/ ((g^{(i)}(n,\mathcal{H}_n^i))_{i \in [\![1,2\kappa-1]\!]}, \mathcal{H}_n/\sqrt{n}) \to 1$,

where the r.v. $g^{(i)}(n,\mathcal{H}_n^i)$'s are independent conditionally on $\mathcal{H}_n$, and conditionally on $\mathcal{H}_n^i = l$, $g^{(i)}(n,\mathcal{H}_n^i) \stackrel{(d)}{=} g(n,l)$ (these variables were introduced in Section 2.1). On $\Omega$, $\mathcal{H}_n/\sqrt{n} \xrightarrow[n]{\text{a.s.}} \mathcal{H}_\infty$. Conditionally on $\mathcal{H}_\infty$, by Proposition 3, $(g^{(i)}(n,\mathcal{H}_n^i))_{i \in [\![1,2\kappa-1]\!]}$ converges in distribution to $(g_\infty^{(i)})_{i \in [\![1,2\kappa-1]\!]}$, where the $g_\infty^{(i)}$ are independent, and $g_\infty^{(i)}$ is a centered Gaussian vector having as covariance function

(33) $\mathrm{cov}((g_\infty^{(i)})_{k,j}, (g_\infty^{(i)})_{k',j'}) = \mathcal{H}_{\infty,i}\,(-\mu_k\mu_{k'} + \mu_k 1_{(k,j)=(k',j')})$.

In order to check that this implies the convergence of the finite-dimensional distributions in Theorem 2, it suffices to reconstitute the contents of the branches $[\![\varnothing,u_i]\!]$'s by summing the contents of the spanned branches they contain, and to use that asymptotically, conditionally on $\mathcal{H}_\infty$, these contents are independent [and that the shape is fixed by (32) for $n$ large enough]. Hence, by (33), one easily gets the fact that each $G_{k,j}(s_i)$ is Gaussian with the law described in Theorem 2. Knowing $V_\infty$, the limiting $G(s_i)$'s are obtained as sums of independent Gaussian vectors. To compute the covariance between $G_{k,j}(s_i)$ and $G_{k',j'}(s_{i'})$ (for $s_i < s_{i'}$), we use that the nodes in $[\![\varnothing,u_{i,i'}]\!]$ are the common ancestors of $u_i$ and $u_{i'}$. The contents of the branches $]\!]u_{i,i'},u_i]\!]$ and $]\!]u_{i,i'},u_{i'}]\!]$ are asymptotically independent. By (33), one then checks that, knowing $V_\infty$, the covariance $\mathrm{cov}(G_{k,j}(s_i), G_{k',j'}(s_{i'}))$ is ruled by the common ancestors, and then equals $(-\mu_k\mu_{k'} + \mu_k 1_{(k,j)=(k',j')}) \times \min\{h(s),\ s \in [s_i,s_{i'}]\}$. □

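2.4. Tightness in Theorem 2. We only prove the tightness of the family $(G^{(n)})$, since one already knows that $(h_n)$ is tight [since $h_n \xrightarrow[n]{(d)} h$]. In this section we assume (H1) and (H2).

We collect in the set $\Omega_n^{\alpha,\delta,\gamma,\rho}$ the trees with $n$ edges having some suitable properties:

$\Omega_n^{\alpha,\delta,\gamma,\rho} = \Big\{T \in \mathcal{T}_n,\ \forall t,s \in [0,1],\ |h_n(s) - h_n(t)| \le \delta|t-s|^\alpha,\ \max_l\big||u(l+1)| - |u(l)|\big| \le \rho\log n,\ |u(n)| \le \rho\log n,\ \forall (k,j) \in I_K, u \in T, l \in [\![0,|u|]\!],\ |A_{u,l,k,j} - \mu_k l| \le \gamma\sqrt{l\log n}\Big\}$.

Lemma 17. For any $\varepsilon > 0$, $\alpha < 1/2$, there exist $\delta > 0$, $\gamma > 0$, $\rho > 0$ s.t. $\mathbb{P}_n(\Omega_n^{\alpha,\delta,\gamma,\rho}) \ge 1 - \varepsilon$ for $n$ large enough.

According to Lemmas 7 and 8, and Remark 1, only the condition on the Hölder property of $h_n$ has to be checked. We refer to Marckert and Miermont [19], Section 5.2, for a proof of this result.

Let $\varepsilon > 0$ be fixed. Set $\alpha = 2/5$ and choose $\delta > 0$, $\gamma > 0$, $\rho > 0$ s.t. $\mathbb{P}_n(\Omega_n^{\alpha,\delta,\gamma,\rho}) \ge 1 - \varepsilon$ for $n$ large enough. For these choices, write $\Omega_n$ instead of $\Omega_n^{\alpha,\delta,\gamma,\rho}$.

We will establish the following proposition.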

Proposition 18. For any $\alpha > 0$, there exist $\beta > 0$, $c > 0$ s.t. for any sufficiently large $n$,

(34) $\mathbb{E}(\|G_t^{(n)} - G_s^{(n)}\|_1^\beta 1_{\Omega_n}) \le c|t-s|^{1+\alpha}$ for any $s,t \in [0,1]$.

This implies that for any $\alpha > 0$ the $(1+\alpha)/\beta$-Hölder norm of the family $(G^{(n)})$ is tight, and then that $(G^{(n)})$ is tight in $C([0,1])^{\#I_K}$ [recall that $G_0^{(n)}$ is the null vector of $\mathbb{R}^{\#I_K}$].

We first point out that, using (20), we get that for any $\alpha > 0$ there exists $\beta > 0$ such that, for $n$ large enough, $\mathbb{E}(\|G^{(n)}(s_n) - G^{(n)}(s)\|_1^\beta 1_{\Omega_n}) \le c(s - s_n)^{1+\alpha}$. Hence, we can restrict ourselves to proving (34) only for $s$ and $t$ such that $ns$ and $nt$ are integers (this is classical). From now on, we assume that $s,t$ are in $[0,1]_n := [0,1] \cap \mathbb{N}/n$, and $s \ne t$.

We set $u_1 = u(\lfloor ns\rfloor)$, $u_2 = u(\lfloor nt\rfloor)$, $u_{1,2}$ their deepest common ancestor and $D_n(s,t) = d(u_1,u_2)$. There exists $\delta' > 0$ such that, for $T \in \Omega_n$, any $s,t \in [0,1]_n$, $s \ne t$,

$D_n(s,t) \le 2 + H_n(nt) + H_n(ns) - 2\min_{k \in [ns,nt]}H_n(k) \le 2\sqrt{n}\,\delta|t-s|^\alpha + 2 \le \delta'\sqrt{n}\,|t-s|^\alpha$.

Lemma 19. For any $\alpha > 0$, there exist $\beta > 0$, $c > 0$ s.t. for any $s,t \in [0,1]$ such that $|s-t| \le (\log n)^{-3}$, for $n$ large enough,

(35) $\mathbb{E}(\|G_t^{(n)} - G_s^{(n)}\|_1^\beta 1_{\Omega_n}) \le c|t-s|^{1+\alpha}$.

Proof. Let $s,t \in [0,1]_n$, $s \ne t$. We use a deterministic bound valid for all trees $T$ in $\Omega_n$. Let $(k,j) \in I_K$ be fixed. As in the proof of Proposition 4, it suffices to show that

(36) $n^{-\beta/4}\big|A_{u_1,k,j} - \mu_k|u_1| - A_{u_2,k,j} + \mu_k|u_2|\big|^\beta \le c|s-t|^{1+\alpha}$.

Passing via $u_{1,2}$, the left-hand side of (36) is smaller than

$c_1n^{-\beta/4}\big(|A_{u_1,h_1,k,j} - \mu_kh_1|^\beta + |A_{u_2,h_2,k,j} - \mu_kh_2|^\beta + 2^\beta\big)$,

where $h_1 := d(u_1,u_{1,2}) - 1$ and $h_2 := d(u_2,u_{1,2}) - 1$ (the contribution of $u_{1,2}$ is bounded by the term 2). Using that $|A_{u,l,k,j} - \mu_kl| \le \gamma\sqrt{l\log n}$ for any $l$ and $l \le D_n(s,t) \le \delta'n^{1/2}|t-s|^{2/5}$, we find

$n^{-\beta/4}\big|A_{u_1,k,j} - \mu_k|u_1| - A_{u_2,k,j} + \mu_k|u_2|\big|^\beta \le c_2|t-s|^{\beta/5}(\log n)^{\beta/2}$,

and since $|t-s| \le (\log n)^{-3}$, $|t-s|^{1/6}(\log n)^{1/2} \le 1$ and then $c_2|t-s|^{\beta/5}(\log n)^{\beta/2}$ is smaller than $|t-s|^{1+\alpha}$ for $\beta$ and $n$ large enough. □

Lemma 20. For any $\alpha > 0$, there exist $\beta > 0$, $c > 0$ s.t. for any $t \in [0,1]$, for any $n$ large enough,

(37) $\mathbb{E}(\|G_t^{(n)}\|_1^\beta 1_{\Omega_n}) \le ct^{1+\alpha}$.

Proof. First, consider the case t = \. In l^e, we have |u(n)| < plogn 
and then 

and this is smaller than cl^"*"" for any a>0, OO, /?>0 for n large enough. 
By the previous lemma and a simple computation [using that max||n(/ + 
1)1 — |n(/)|| < yologn], one sees that (37) is true if t ^ Vn, where 

K:=[(logn)-3,l-(logn)-3]. 

Assume now that t S Ki- In O^, the Holder property of h„ and the inequality 
|u(n)| < plogn imply that, for t G Vn, 

(38) \u{[nt\)\ <L;^{t) ■.= C2r^'^[t^{l-t)X'. 

For any real number a, we denote by a • ;U the vector {afik)(k,j)£iK- Using 
(9) and (13), there exists C3 > such that, for t £Vn and n large enough, 



IE(l|Gi")||?lf.. 



<^3 E E 



a 



\a-h-fi\\^ 



/i<L„(t)aeN[^j 

X P(|f^,(3)| = [nt\ - /i,|f{+^^(3)| = n + 1 - [nt\) 

and by Otter [23] 
^4 E E 

h<L„it) a6N[^j 

^ iVi(a)(l + iV2(a))P(WLn*J-/^ = iVi(a))IP(^n-M+i = 1 + A^2(a)) 
([ntj - /i)(n - [ntj + 1) 
where $(W_k)$ is the random walk described in the beginning of Section 2.2.1. In order to bound these two last probabilities, we use a classical concentration property valid for any nondegenerate random walk $(W_k)_k$ (trivial consequence of Petrov [24], Theorem 2.22 p. 76): there exists a constant $c_5$ such that, for any $n > 0$,

(39) $\sup_y\mathbb{P}(W_n = y) \le c_5/\sqrt{n}$.

Now, for any a G ^fh]-' ^^'^ -^2(2) are smaller than Kh, and for any 

h < Ln{t), t £Vn and n large enough, nt — h>nt/2. We then get 

m-nipWll/^lo ^ <r V \^ Qfe(a)||a-/i-/x||f/i2 

MIIW |lllnJ_C6 2. 2. ^/3/4-3/2(LntJ -/i)3/2(„_ L„tJ +1)3/2 • 

Using Proposition 4, we obtain that, for any t G Vn, 

FnipWii/^, w (^(0)"/^+^ ^„ (tA(i-t))/^/^+3 

Remark 2. The last formula implies that, for any a > 0, there exist 
/3 > 0, c> s.t. for any t G V„, for any n large enough, 

(40) mGt\fl^n^<c{t^{l-t)f+\ 

This allows to prove a part of Proposition 18: since E(||Gj"^ — gI"^ |Ii In^) !^ 
cE(ln^ (||gJ") II? + l|Gi"^ llf )) when s, t G K and s < t, 

- a s<t- s [in this case t<2{t- s)], then E(||gJ"^ - Gt\\itn,) < 
c(^-s)^+^ 

- if 1 — f < t — s [in this case 1 — s < 2(t — s)], then 

E(||Gi") - G(")||f InJ < c((1 - t)^+'^ + (1 - sy+-) < C2{t - s)i+^ 

Thanks to this remark, only the case s,t £ Vn, s <t, and 

(41) [sA(l-s)] >t-s and [t A {I - t)] >t - s 

remains to be checked. So assume that s and t satisfy these constraints. 

Consider An = {Ai,Al,Al) = (^(iii,2,ni), %i,2,«2)' ^(0,«i,2)) ^^e contents 
of the "three" spanned branches in Tg2 (some of these spanned branches 
may be empty). We have 

E(||Gi")-G(")||Xj 



< 



E E 



FniAl, = ai,i = 1,2, 3)[||ai - hi ■ /x||'^ + ||a2 - /12 • /u||^] 



n/3/4 

hl,h2,h3 31,32,33 
where the first sum is taken on $h_1 + h_3 \le L_n(s)$, $h_2 + h_3 \le L_n(t)$, $h_1 + h_2 \le R_n(s,t) := \delta'n^{1/2}|t-s|^\alpha$, where $L_n(x)$ is given in (38). By (24), Comment 1 and the Otter formula, for $h_1, h_2, h_3, a_1, a_2, a_3$ fixed,

(43) P.(-4j, = a„ z = 1, 2, 3) < cn^' sup {[ Q.. (3^) W^^. = W) ^ 

^ i=i 

where the supremum is taken on 9 = (^1,^25^3) G and where Fi = 

ns + l-\u{[ns\)\-l,F2 = n{t-s) + l-{\u{[nt\)\-\ui^2\),F:i=n{l-t) + l, 
5i = TVi (33) + A^i (a 1 ) + ^1 , ^2 = iV2 (a 1 ) + A^i (32 ) + ^2 , S3 = Af2 (32 ) + iV2 (as) + 

We plug this bound in (42), and bound the left-hand side using the fol- 
lowing ingredient: 

- the probabilities in (43) involving the random walks are bounded using 
(39). 

- for a G ^[hy ^i(a) and then for a constant c > 0, 

5i <K|n([nsJ)| +^1 <ci;(s), 

S:i<K\u{[nt\)\+e-i<cL~n{t). 

The denominators are bounded using \ t — s\ > (logn)^"^, [t A (1 — t)] > (logn)~^ 
[s A (1 — s)] > (logn)"'^, and then for n large enough, 

$F_1 \ge ns/2$, $F_2 \ge n(t - s)/2$, $F_3 \ge n(1 - t)/2$.

Finally, we get that the left-hand side of (42) is smaller than 

C< Ln{s)Ln{t)Dn{s,t) 



X 

h 



E E nQh,(30[||3l-/ll-/i||?+||32-/l2./x||f]| 



X {n^/4-3/2[n3(s A (1 - s)){t A (1 - t))(t - s)f'^]-^ . 
The double sum is smaller than 

hi,h2,hz 

this last factor $L_n(s)$ being a bound of $h_3$. Finally,



,(n)||/3. ^ . _3/2-/3/4 (L„ (g))^L„(t) (Z^^ (g, ^))/^/' 

^•^(s A (1 

By (41), it suffices to take $\beta$ large enough. □



2+3 

E(||Gi ^ - IMnJ < cn ^^3^^ ^ _ ^^^^^ ^ _ ^^^^^ _ ^^^3/^ , 
2.5. Proof of Theorem 1. Consider the representation of $\ell(u)$ given in
(2). For any $s$ such that $ns$ is an integer,

(44) $r_n(s) = r_n^{(1)}(s) + r_n^{(2)}(s)$,

where

$r_n^{(1)}(s) = n^{-1/4} \sum_{(k,j) \in I_K} \sum_{l=1}^{A_{u(ns),k,j}} (Y_{k,j}^{(l)} - m_{k,j})$,

$r_n^{(2)}(s) = n^{-1/4} \sum_{(k,j) \in I_K} (A_{u(ns),k,j} - \mu_k |u(ns)|)\, m_{k,j} = \langle G^{(n)}(s), \vec m \rangle$,

where $\vec m = (m_{k,j})_{(k,j) \in I_K}$ and $\langle a, b \rangle = \sum_{(k,j) \in I_K} a_{k,j} b_{k,j}$. For $s$ in $[i/n, (i+1)/n]$,
$r_n^{(1)}(s)$ and $r_n^{(2)}(s)$ are defined by linear interpolation. Since $h_n \xrightarrow{(d)} h$
in $C[0,1]$, by the Skorohod representation theorem [14], Theorem 3.30, there
exists a probability space $\Omega$ on which this convergence is a.s. On this space,
by Theorem 2, $G^{(n)}$ converges in distribution in $C([0,1])^{\#I_K}$ to $G^h$, where
$G^h$ has the distribution of $G$ knowing $h$. Now, since the map

$C([0,1])^{\#I_K} \longrightarrow C[0,1]$,

$(s \mapsto g(s)) \longmapsto (s \mapsto \langle g(s), \vec m \rangle)$

is continuous, on $\Omega$ we have

(45) $\langle G^{(n)}, \vec m \rangle \xrightarrow{(d)} r^{(2)} = \langle G^h, \vec m \rangle$

in $C([0,1])$. On $\Omega$, $r^{(2)}$ is a centered Gaussian process with covariance
function

$\mathrm{cov}(r^{(2)}(s), r^{(2)}(t)) = h(s,t) \sum_{(k,j) \in I_K} \sum_{(k',j') \in I_K} (-\mu_k \mu_{k'} + \mu_k \mathbf{1}_{(k,j)=(k',j')})\, m_{k,j} m_{k',j'}$.
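The double sum in this covariance is a quadratic form of multinomial type. As a sanity check, the sketch below evaluates it in exact arithmetic on a toy example and compares it with the closed form $\sum \mu_k m_{k,j}^2 - (\sum \mu_k m_{k,j})^2$. The values of `K`, `mu` and `m` are hypothetical and chosen only so that the global centering $\sum_{(k,j)} \mu_k m_{k,j} = 0$ holds; the helper names are ours.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy data (not from the paper): K = 2 and
# I_K = {(k, j): 1 <= j <= k <= K}.
K = 2
I_K = [(k, j) for k in range(1, K + 1) for j in range(1, k + 1)]

# Assumed weights mu_k; the identity checked below holds for any weights.
mu = {1: Fraction(1, 2), 2: Fraction(1, 4)}

# Assumed displacement means m_{k,j}, chosen so that
# sum over (k, j) of mu_k * m_{k,j} equals 0 (global centering).
m = {(1, 1): Fraction(1), (2, 1): Fraction(-3), (2, 2): Fraction(1)}


def quadratic_form(mu, m, index_set):
    """Double sum of (-mu_k mu_k' + mu_k 1{(k,j)=(k',j')}) m_{k,j} m_{k',j'}."""
    total = Fraction(0)
    for (k, j), (k2, j2) in product(index_set, repeat=2):
        coeff = -mu[k] * mu[k2]
        if (k, j) == (k2, j2):
            coeff += mu[k]
        total += coeff * m[(k, j)] * m[(k2, j2)]
    return total


def closed_form(mu, m, index_set):
    """sum mu_k m_{k,j}^2 minus the square of sum mu_k m_{k,j}."""
    second_moment = sum(mu[k] * m[(k, j)] ** 2 for (k, j) in index_set)
    mean = sum(mu[k] * m[(k, j)] for (k, j) in index_set)
    return second_moment - mean ** 2


global_mean = sum(mu[k] * m[(k, j)] for (k, j) in I_K)
print(global_mean, quadratic_form(mu, m, I_K), closed_form(mu, m, I_K))
```

With a globally centered choice of means, the squared term vanishes and the quadratic form reduces to $\sum_{(k,j)} \mu_k m_{k,j}^2$, which is the contribution of $r^{(2)}$ to the limiting covariance.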

On $\Omega$ (or on an enlarged space), $r_n^{(1)}$ is the standard head of a discrete snake
associated with independent centered displacements. As shown in [19], under
(H1) and (H2),

(46) $r_n^{(1)} \xrightarrow{(d)} r^{(1)}$

in $C([0,1], \mathbb{R})$, where $r^{(1)}$ given $h$ is a centered Gaussian process with
covariance function

$\mathrm{cov}(r^{(1)}(s), r^{(1)}(t)) = h(s,t) \sum_{(k,j) \in I_K} \mu_k \sigma_{k,j}^2$.
It remains to prove that, given $h$, the finite-dimensional distributions of
$r^{(1)}$ and $r^{(2)}$ are independent. We establish the "asymptotic independence"
between the two processes $r_n^{(1)}$ and $r_n^{(2)}$ knowing $h$. The arguments are quite
straightforward; we only detail the one-dimensional case. Let

$\mathcal{T}_n^\nu = \{T \in \mathcal{T}_n : \forall (k,j) \in I_K, \forall u \in T, |A_{u,k,j} - \mu_k |u|| \le n^{1/4+\nu}\}$.

According to Lemma 9 in [19], for any $\nu > 0$, $\varepsilon > 0$, if $n$ is large enough,
$P_n(\mathcal{T}_n^\nu) \ge 1 - \varepsilon$. Letting $s \in [0,1]$ (such that $ns$ is an integer), one may
compare

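$r'_n(s) = n^{-1/4} \sum_{(k,j) \in I_K} \sum_{l=1}^{\lfloor \mu_k |u(\lfloor ns \rfloor)| - n^{1/4+\nu} \rfloor} (Y_{k,j}^{(l)} - m_{k,j})$

with $r_n^{(1)}(s)$, where the same r.v. $Y_{k,j}^{(l)}$ are involved in both $r'_n$ and $r_n^{(1)}$.
Knowing $|u(\lfloor ns \rfloor)|$, $r'_n(s)$ is independent of $r_n^{(2)}(s)$ since $r'_n(s)$ is a function
of the $Y_{k,j}^{(l)}$'s whereas $r_n^{(2)}(s)$ is a function of the $A_{u(ns),k,j}$'s. We will prove
that $|r_n^{(1)}(s) - r'_n(s)| \xrightarrow{\text{proba.}} 0$, which is sufficient to deduce that $r^{(1)}$ and $r^{(2)}$
are independent given $h$ (in the one-dimensional case): indeed, the distance
in $\mathbb{R}^3$ between $(r'_n(s), r_n^{(2)}(s), h_n(s))$ and $(r_n^{(1)}(s), r_n^{(2)}(s), h_n(s))$ goes to 0
in probability; hence, $(r'_n(s), r_n^{(2)}(s), h_n(s)) \xrightarrow{(d)} (r^{(1)}(s), r^{(2)}(s), h(s))$ and then
$(r^{(1)}(s), r^{(2)}(s))$ are independent given $h(s)$ since $(r'_n(s), r_n^{(2)}(s))$ are independent
given $h_n(s)$.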
We have 

$P_n(|r'_n(s) - r_n^{(1)}(s)| \ge x) \le P(|r'_n(s) - r_n^{(1)}(s)| \ge x, \mathcal{T}_n^\nu) + P_n(\mathcal{T}_n \setminus \mathcal{T}_n^\nu)$.

The last term goes to 0 for any $\nu > 0$. The Rosenthal inequality ([24], Theorem 2.11)
asserts that if $(X_k)_k$ is a sequence of independent centered r.v. and $q \ge 2$,
then

$E\bigl(\bigl|\sum_{k=1}^n X_k\bigr|^q\bigr) \le c(q)\Bigl(\sum_{k=1}^n E(|X_k|^q) + \bigl(\sum_{k=1}^n \mathrm{var}(X_k)\bigr)^{q/2}\Bigr)$,

where $c(q)$ is a positive constant depending only on $q$. For $p$ satisfying (H2),
we have



(47) E 



i=l 



(1)| 



$P(|r'_n(s) - r_n^{(1)}(s)| \ge x, \mathcal{T}_n^\nu) \le E(x^{-p} |r'_n(s) - r_n^{(1)}(s)|^p \mathbf{1}_{\mathcal{T}_n^\nu})$.

Conditioning at first by the $A_{u(ns)}$'s and using (47), we get an upper bound for
$P(|r'_n(s) - r_n^{(1)}(s)| \ge x, \mathcal{T}_n^\nu)$,
and then for $\nu < 1/4$, for any $x > 0$, the bound goes to 0.
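To see where the threshold $\nu < 1/4$ comes from, here is a sketch of the moment computation under assumptions that we state explicitly; it is not claimed to reproduce the paper's displays. We assume that, conditionally on the lineage, $r'_n(s) - r_n^{(1)}(s)$ is $n^{-1/4}$ times a sum of at most $N$ independent centered terms, that $N \le \#I_K (2 n^{1/4+\nu} + 1)$ on $\mathcal{T}_n^\nu$, and that the $p$-th moments are bounded by some $M_p < \infty$ with $p \ge 2$; the symbols $N$, $M_p$, $\sigma^2$ and $C$ are ours.

```latex
% Assumed notation (ours): M_p := \max_{(k,j)} E|Y_{k,j} - m_{k,j}|^p,
% \sigma^2 := \max_{(k,j)} \mathrm{var}(Y_{k,j}), N := \#I_K (2 n^{1/4+\nu} + 1).
% Rosenthal's inequality applied conditionally on the lineage gives
\begin{align*}
E\bigl(|r'_n(s) - r_n^{(1)}(s)|^p \, \mathbf{1}_{\mathcal{T}_n^\nu}\bigr)
  &\le c(p)\, n^{-p/4} \bigl( N M_p + (N \sigma^2)^{p/2} \bigr) \\
  &\le C \bigl( n^{1/4 + \nu - p/4} + n^{\frac{p}{2}(\nu - \frac{1}{4})} \bigr).
\end{align*}
% For p \ge 2 and \nu < 1/4 both exponents are negative, so by the Markov
% inequality P(|r'_n(s) - r_n^{(1)}(s)| \ge x, \mathcal{T}_n^\nu) \to 0 for any x > 0.
```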

Hence, $r$ is a centered Gaussian process whose covariance function is the sum
of the ones of $r^{(1)}$ and $r^{(2)}$. Using that $\sum_{(k,j)} \mu_k m_{k,j} = m = 0$, we get

$\mathrm{cov}(r(s), r(t)) = h(s,t) \sum_{(k,j)} \mu_k E(Y_{k,j}^2)$.
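For completeness, the last identity can be checked by adding the two covariance functions, which is legitimate since $r^{(1)}$ and $r^{(2)}$ are independent given $h$. In the short derivation below we write $\sigma_{k,j}^2 = \mathrm{var}(Y_{k,j})$ and $m_{k,j} = E(Y_{k,j})$; this notation is assumed for illustration, and all sums run over $I_K$.

```latex
\begin{align*}
&\sum_{(k,j)} \mu_k \sigma_{k,j}^2
 + \sum_{(k,j)} \sum_{(k',j')} \bigl( -\mu_k \mu_{k'} + \mu_k \mathbf{1}_{(k,j)=(k',j')} \bigr) m_{k,j} m_{k',j'} \\
&\qquad = \sum_{(k,j)} \mu_k \bigl( \sigma_{k,j}^2 + m_{k,j}^2 \bigr)
   - \Bigl( \sum_{(k,j)} \mu_k m_{k,j} \Bigr)^2 \\
&\qquad = \sum_{(k,j)} \mu_k E(Y_{k,j}^2) - m^2
   = \sum_{(k,j)} \mu_k E(Y_{k,j}^2) \qquad \text{since } m = 0.
\end{align*}
```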